System for production of antibodies and their derivatives

ABSTRACT

The present disclosure provides methods and compositions for the production of chimeric antibodies that specifically bind an antigen of interest.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application No. 61/733,358, filed Dec. 4, 2012, the content of which is incorporated herein in its entirety.

TECHNICAL FIELD

The present disclosure relates to methods and compositions for producing chimeric antibodies that specifically bind an antigen of interest.

BACKGROUND

Prior to Sep. 11, 2001, the list of pathogens that threatened humanity on a day-to-day basis was relatively short, and people had found means of decreasing the threat from these pathogens by developing corresponding vaccines. Nowadays this list has swelled to many times its pre-September 11 size, and the threat of exposure of populations to agents from this list has grown immensely. Many vaccines are so old that they have lost their potency, while vaccines for other agents simply do not exist. The situation with the anti-botulinum neurotoxin (anti-BoNT) vaccine is a perfect example of the former situation. As a result, the traditional vaccination approach can no longer be used to the full extent to protect society from such threats.

BoNTs are classified as Category A agents, one of the six highest-risk threat agents for bioterrorism (2). These homologous but serologically distinct toxins (serotypes A, B, C, D, E, F and G) specifically target neurons and, through interruption of neurotransmission, cause muscle paralysis, which leads to death from asphyxiation. It has been estimated that aerosol exposure of 100,000 individuals to the toxin, as could occur with an aerosol release over a metropolitan area, would result in 50,000 cases of illness with 30,000 fatalities (3). Such an exposure would result in 4.2 million hospital days and an estimated cost of $8.6 billion.

Pentavalent botulinum toxoid was generated over 30 years ago via chemical inactivation of native toxins of five different serotypes. This vaccine received Investigational New Drug status from the CDC (for at-risk workers) and from the United States Army's Office of the Surgeon General (for military deployment). It was stockpiled and over the years was used more than 20,000 times (4). However, it also lost potency over the years, and the CDC recently issued a notice of its discontinuation (5). The first reports of efforts to generate a new recombinant substitute for pentavalent toxoid were published almost 17 years ago (6). However, no new anti-BoNT vaccines have been approved yet. BoNTs of serotypes A and B are currently used under the trade names BOTOX® and MYOBLOC® in medicine as potent drugs and as rejuvenation agents in cosmetics. Thus, it is unlikely that many people would be willing to undergo vaccination and give up the current benefits of these “miracle” drugs even if new anti-BoNT vaccines were to be developed. A more realistic strategy for raising preparedness against the threat of a bioterrorist attack would include stockpiling pathogen-specific antibodies and using them in case of an immediate threat of bioterrorist attack or soon after it.

The injection of heterologous antibodies, however, causes acute or delayed hypersensitivity reactions in 9% of cases, including serum sickness (3.7%) and anaphylactic shock (1.9%) (7). Further, application of non-human antibodies might trigger the development of an immunologic response, which will reduce or eliminate the benefit of repeated applications of such antibodies. Securing substantial quantities of human antigen-specific sera, however, may be an extremely expensive endeavor. For example, the Orphan Drug human Botulism Immune Globulin has been approved by the FDA for treatment of infant botulism. It was formulated on the basis of serum obtained from human volunteers vaccinated with pentavalent botulinum toxoid. The price of this drug for treatment of one patient is $45,300.

SUMMARY

In one aspect, the present disclosure provides a method for producing a chimeric immunoglobulin-G (IgG) antibody that specifically binds an antigen of interest comprising: a) isolating nucleic acid sequences encoding IgG heavy and light chain variable regions from a single immune cell producing an IgG that specifically binds the antigen of interest; b) cloning the nucleic acid sequences of part a) into separate expression vectors comprising the IgG heavy or light chain constant regions, or into a single expression vector comprising both the IgG heavy and light chain constant regions; c) introducing the expression vector(s) of part b) into a host cell; d) establishing a stable cell line from the host cell of part c); and e) isolating the IgG produced by the stable cell line of part d), wherein the method comprises simultaneous cloning of the IgG heavy and light chain variable regions isolated from the immune cell of part a), and wherein the expression vector of part b) allows for (i) unidirectional insertion of the IgG heavy and light chain variable regions into the vector, and (ii) positive selection of expression vectors comprising cloned sequences.

In some embodiments, the antigen of interest is derived from a pathogen. In some embodiments, the antigen of interest is a Clostridium botulinum neurotoxin.

In some embodiments, the expression vector is selected from the group consisting of pVLentry-Hyg10, pVHentry-Cm5, pVHentry-GFP1, pVHentry-MLuc7, pVHentry-Hisbio1, and pVHentry-CBD1.

In some embodiments, the stable cell line of part d) is established through expression of an antibiotic resistance gene present in the expression vector of part b). In some embodiments, the level of expression of the antibiotic resistance gene by the stable cell line correlates to the level of IgG production by the stable cell line.

In some embodiments, parts a) and b) comprise the steps of: i) reverse-transcription of mRNA released from the immune cell upon exposure to perfringolysin O; ii) simultaneous amplification of cDNAs produced in part i) encoding the IgG heavy chain variable region (V_(H)) and the IgG light chain variable region (V_(L)); iii) separate re-amplification of the V_(H) and V_(L) sequences of part ii); and iv) insertion of the re-amplified sequences of part iii) into the expression vector of part b).

In some embodiments, the reverse transcription is performed using a primer selected from the group consisting of IgG-CHH, Cm1, and Clv-3.

In some embodiments, the simultaneous amplification is performed using primers selected from the group consisting of pVk-1, pVk-2, pVk-3, pVk-4, hIgGk-3, IgGH-1, IgGH-2, IgGH-3, IgGH-4, IgGH-5, IgG-CHH, M1, M2, M3, M4, Cm1, V11-5T7, V12-5T7, V13-5T7, V14-5T7, V15-5T7, and C1-3.

In some embodiments, the re-amplification is performed using primers selected from the group consisting of Vk-1/2-5T7, Vk-3-5T7, Vk-4-5T7, hIgGk-3, IgG-CH, Vh-1-3T7, Vh-1-3T75, Vh-1-5T7, Vh-2-5T7, Vh-3-5T7, Vh-4-5T7, Vh-5-5T7, Vh-6-5T7, Vh-7-5T7, Vh-8-5T7, Vh-1-3T75, Vm-1-5T7, Vm-2-5T7, Vm-3-5T7, Vh-1-3T75, Vl1-5T7, Vl2-5T7, Vl3-5T7, Vl4-5T7, Vl5-5T7, and hIgG1-3.

In some embodiments, the method further comprises formulating the chimeric IgG into a therapeutic composition. In some embodiments, the method further comprises formulating the chimeric IgG into an antigen-specific resin or system for detecting corresponding antigens.

In some embodiments, the immune cell is selected from the group consisting of a plasma cell, a B-cell, or any other cell that secretes or displays on the cell surface immunoglobulins specific for the antigen of interest.

In some embodiments, the host cell is selected from the group consisting of a Chinese hamster ovary (CHO) cell, a human embryonic kidney (HEK) cell, a mouse NS1/1-Ag 4-1 cell, a NSO/u cell, an X63/Ag 8.653 cell, an SP2/0 Ag14 cell, a rat Y3 (210.RCY3.Ag 1.2.3) cell, a YB213.0Ag3 (Y0) cell, and any other mammalian secondary cell line capable of producing immunoglobulins.

In some embodiments, the method allows for high-throughput production of antibodies against the antigen of interest.

In one aspect, the present disclosure provides a method for detecting an antigen of interest in a sample, comprising the steps of (a) contacting the sample with an antibody that specifically binds the antigen under conditions that promote the formation of an antibody-antigen complex, (b) contacting the antibody-antigen complex with a fusion protein comprising (i) the immunoglobulin-binding domains of staphylococcal protein A and streptococcal protein G, and (ii) Metridia longa luciferase or a derivative lacking the N-terminal region, under conditions that promote binding of the fusion protein to the antibody-antigen complex, and (c) detecting the Metridia longa luciferase.

In some embodiments, the fusion protein is encoded by a vector selected from the group consisting of pS14L-spAG-MLuc16, pETspAG-ΔN-MLuc1, and pS14L-spAG-ΔN-MLuc15. In some embodiments, the fusion protein is encoded by pS14L-spAG-MLuc16 or pETspAG-ΔN-MLuc1. In some embodiments, the fusion protein is encoded by pS14L-spAG-ΔN-MLuc15.

In one aspect, the present disclosure provides an IgG fusion protein comprising IgG heavy chains fused with a peptide or polypeptide selected from the group consisting of green fluorescent protein (GFP), Metridia longa luciferase, cellulose binding domain, 6× histidine, or a biotinylatable peptide.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows the structure of the pVLentry-Hyg10 and pVHentry-Cm5 vectors. Plac and Pamp: bacterial promoters; PCMV ie: the immediate early promoter of CMV; IRES: internal ribosome entry site; SV40 poly A and HSV TK polyA: transcription terminators; f1 ori and pUC ori: phage and plasmid origins of replication; 10b, IGHG1, and lacZ′: sequences encoding phage T7 protein 10b, the constant part of human IgG, and the α-peptide of β-galactosidase, respectively; Ap(R), CM(R), Km(R) and Hygromycin-delEsp: sequences encoding resistance to the antibiotics ampicillin, chloramphenicol, G418 and Hygromycin B (this sequence was modified to remove an Esp3I site), respectively. Underlined are sequences of cohesive ends generated by Esp3I.

FIG. 2 shows the assembly of IgG-encoding sequences using cohesive ends generated by T4 DNA polymerase. DNApolT4 (dCTP): treatment with T4 DNA polymerase in a mixture containing only dCTP. Esp3I and ligase: two additional treatments, with endonuclease Esp3I and DNA ligase, respectively, that are required for assembly of IgG-encoding sequences. IG-V, IGHG1 and 10b: sequences encoding the variable and constant parts of the IgG chain and protein 10b, respectively.

FIG. 3 shows the interaction of gfpBoNT/A-CH5 with its receptors on the surface of the neuroblastoma cell. gfpBoNT/A-CH5 was added to SH-SY5Y cells, and after 15 minutes the cells were subjected to microscopy.

FIG. 4 shows the effect of antibiotic resistance selection on production of human IgG by CHO cells. Dilutions of media from the original IgG-producing culture and its derivative selected at higher concentrations of antibiotics were loaded into wells of a 96-well plate coated with BoNT/A-CH. Immobilized IgGs were visualized by treatment of wells with biotinylated anti-human antibodies followed by treatment with streptavidin-horseradish peroxidase and 1-STEP™ Slow TMB-ELISA (Pierce, Inc.).

FIG. 5 shows the composition of proteins purified from cell culture media. Proteins were separated by SDS-PAGE and were either stained by Coomassie (right portion) or transferred onto a nitrocellulose membrane and treated with biotinylated anti-human IgG. Bound antibodies were visualized by treatment with streptavidin-horseradish peroxidase conjugate and 1-STEP™ Slow TMB-ELISA (Pierce, Inc.) and 1-STEP™ Ultra TMB (Pierce, Inc.). Lane 1: pre-stained molecular weight markers from Fermentas, Inc.; lane 2: protein purified from media of cells generated by transfection with plasmid encoding both chains of IgG; lane 3: protein from cells transfected with plasmid encoding human IgG whose heavy chain is fused with GFP; lane 4: protein from cells transfected with plasmid encoding human IgG whose heavy chain is fused with MLuc.

FIG. 6 shows the interaction of purified human IgGs with the receptor-recognizing domain of BoNT/A. Dilutions of IgGs purified from media of isolated cell cultures were loaded into wells of a 96-well plate coated with BoNT/A-CH5. Immobilized IgGs were visualized by treatment of wells with biotinylated anti-human antibodies followed by treatment with streptavidin-horseradish peroxidase and Metal Enhanced DAB Substrate Kit (Pierce, Inc.). The control line corresponds to the highest OD₄₅₀ of wells that were treated the same way as the others but did not contain BoNT/A-CH5.

DETAILED DESCRIPTION

The present disclosure provides methods and compositions for robust generation of human monoclonal antibodies targeted at pathogens of interest.

In addition to the set of products that address existing needs, this technology advances our understanding of structure-function relationships in the neurotoxin molecule and provides information about mechanisms of inactivation of this molecule by antibodies.

In practicing the present disclosure, many conventional techniques in cell biology, molecular biology, protein biochemistry, immunology, and bacteriology are used. These techniques are well-known in the art and are provided in any number of available publications, including Current Protocols in Molecular Biology, Vols. I-III, Ausubel, Ed. (1997); Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Ed. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).

Certain terms used herein are defined below.

Unless defined otherwise, all technical and scientific terms used herein generally have the same meaning as commonly understood by one of ordinary skill in the art to which this technology belongs. As used in this specification and the appended claims, the singular forms “a”, “an” and “the” include plural referents unless the content clearly dictates otherwise. For example, reference to “a cell” includes a combination of two or more cells, and the like. Generally, the nomenclature used herein and the laboratory procedures in cell culture, molecular genetics, organic chemistry, analytical chemistry and nucleic acid chemistry and hybridization described below are those well-known and commonly employed in the art. All references cited herein are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent, or patent application were specifically and individually incorporated by reference in its entirety for all purposes.

As used herein, “about” will be understood by persons of ordinary skill in the art and will vary to some extent depending upon the context in which it is used. If there are uses of the term which are not clear to persons of ordinary skill in the art, given the context in which it is used, “about” will mean up to plus or minus 10% of the particular term.

As used herein, “administration” of a composition to a subject includes any route of delivering the compound to the subject to perform its intended function. Administration can be carried out by any suitable route including oral, intranasal, parenteral (intravenous, intramuscular, intraperitoneal, or subcutaneous), or topical. Administration includes self-administration and administration by another.

As used herein, the terms “antigen” and “antigenic” refer to molecules with the capacity to be recognized by an antibody or otherwise act as a member of an antibody-ligand pair. “Specific binding” refers to the interaction of an antigen with the variable regions of immunoglobulin heavy and light chains. Antibody-antigen binding may occur in vivo or in vitro. The skilled artisan will understand that macromolecules, including proteins, nucleic acids, fatty acids, lipids, lipopolysaccharides and polysaccharides, have the potential to act as an antigen. The skilled artisan will further understand that nucleic acids encoding a protein with the potential to act as an antibody ligand necessarily encode an antigen. The artisan will further understand that antigens are not limited to full-length proteins, but can also include partial amino acid sequences. Moreover, sequences from different sources may be combined to generate mosaic antigens, depending on the specific intended use. In some embodiments, the mosaic antigen will include epitopes derived from different proteins. In some embodiments, the mosaic antigen will include epitopes derived from the same protein. The term “antigenic” is an adjectival reference to molecules having the properties of an antigen. In some embodiments, the antigen of interest is a bacterial toxin. In some embodiments, the antigen of interest is a botulinum neurotoxin.

As used herein, the term “epitope” refers to that portion of a molecule that forms a site specifically recognized by an antibody or immune cell. A protein epitope may comprise amino acid residues directly involved in antibody binding, as well as residues not directly involved in binding that are nonetheless included in the antibody-epitope footprint and excluded from the solvent surface. Epitopes may derive from a variety of physical characteristics of a protein, including primary, secondary, and tertiary amino acid structure, and amino acid/protein charge. Epitopes present within a molecule are referred to as “real epitopes.” Real epitopes encompass wild-type sequences and variants of wild-type sequences. Real epitopes may exist within a wild-type protein, a naturally occurring variant of a wild-type protein, or an engineered variant of a wild-type protein. The term “mimetic epitope” refers to a molecule whose primary structure is unrelated to the primary structure of a given real epitope that nonetheless specifically binds to antibodies that recognize the real epitope. Epitopes may be isolated, purified, or otherwise prepared by those skilled in the art. They may be obtained from natural sources including cells and tissues, or they may be isolated from host cells expressing a recombinant form of the epitope.

As used herein, “effective amount” refers to a quantity sufficient to achieve a desired effect. In the context of therapeutic or prophylactic applications, the effective amount will depend on the type and severity of the condition at issue and on the characteristics of the individual subject, such as general health, age, sex, body weight, and tolerance to pharmaceutical compositions. In the context of an antigenic composition, in some embodiments, an effective amount is an amount sufficient to result in a protective response against a pathogen. In other embodiments, an effective amount of an antigenic composition is an amount sufficient to result in antibody generation against the antigen. With respect to antigenic compositions, in some embodiments, an effective amount will depend on the intended use, the degree of immunogenicity of a particular antigenic compound, and the health/responsiveness of the subject's immune system, in addition to the factors described above. The skilled artisan will be able to determine appropriate amounts depending on these and other factors. In the case of a biochemical application, in some embodiments, an effective amount will depend on the size and nature of the sample in question. It will also depend on the nature and sensitivity of the methods in use. The skilled artisan will be able to determine the effective amount based on these and other considerations.

As used herein, the term “polymer resin” refers to resins such as, but not limited to, polysaccharide polymers such as agarose, cellulose, and Sepharose™. The skilled artisan will understand that proteins may be covalently attached to the resin using methods well known in the art, including but not limited to cyanogen bromide activation, reductive amination of aldehydes, and the addition of iodoacetyl functional groups. The skilled artisan will further understand that functional equivalents of polysaccharide polymers may also be used to immobilize proteins.

As used herein, the term “BoNT” refers to any of the seven serologically distinct botulinum neurotoxins produced by Clostridium botulinum, Clostridium argentinense, and Clostridium baratii. Individual serotypes are referred to as BoNT/A, BoNT/B, BoNT/C, BoNT/D, BoNT/E, BoNT/F, and BoNT/G. Exemplary, non-limiting nucleic acid sequences of BoNT/A, /B, /C, /D, /E, /F, and /G are found in GenBank Accession numbers DQ409059, FM865705, AB200364, NZ_ACSJ01000015, AM695754, X81714, and X74162, respectively. Exemplary, non-limiting amino acid sequences of BoNT/A, /B, /C, /D, /E, /F, and /G are found in GenBank Accession numbers ABD65472, CAR97779, BAD90572, ZP_04863672, CAM91137, CAA57358, and CAA52275, respectively. Exemplary, non-limiting nucleic and amino acid sequences of C. tetani tetanus toxin are found in GenBank Accession numbers AF154828 and AAF73267, respectively. As used herein, the term “BoNT/A-L” refers to the full-length botulinum neurotoxin A light chain. As used herein, the term “BoNT/B-L” refers to the full-length botulinum neurotoxin B light chain.

As used herein, the term “anti-BoNT antibody” refers to an antibody capable of specifically binding to BoNT. As used herein, an antibody includes a polyclonal antibody, a monoclonal antibody, and also refers to functional fragments (e.g., fragments which bind an antigen/epitope), such as Fv, Fab, Fc and CDRs.

As used herein, the terms “immunogen” and “immunogenic” refer to molecules with the capacity to elicit an immune response. The response may involve antibody production or the activation of immune cells. The response may occur in vivo or in vitro. The skilled artisan will understand that a variety of macromolecules, including proteins, have the potential to be immunogenic. The skilled artisan will further understand that nucleic acids encoding a molecule capable of eliciting an immune response necessarily encode an immunogen. The artisan will further understand that immunogens are not limited to full-length molecules, but may include partial amino acid sequences (e.g., epitopes). Moreover, sequences from different sources may be combined to generate mosaic immunogens, depending on the specific intended use.

As used herein, the terms “isolate” and “purify” refer to processes of obtaining a biological substance that is substantially free of material and/or contaminants normally found in its natural environment (e.g., from the cells or tissues from which a protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized).

As used herein, the terms “polypeptide,” “peptide,” and “protein” are used interchangeably to mean a polymer comprising two or more amino acids joined to each other by peptide bonds or modified peptide bonds (i.e., peptide isosteres). Polypeptides may include amino acids other than the naturally-occurring amino acids, as well as amino acid analogs and mimetics prepared by techniques that are well known in the art. The skilled artisan will understand that polypeptides, peptides, and proteins may be obtained in a variety of ways including isolation from cells and tissues expressing the protein endogenously, isolation from cells or tissues expressing a recombinant form of the molecule, or chemical synthesis.

As used herein, the term “subject” refers to a member of any vertebrate species. In some embodiments, the subject is avian and includes domestic (e.g., chicken, turkey) and wild bird species. In some embodiments, subjects include mammals such as humans, as well as those mammals of importance due to being endangered, of economic importance (animals raised on farms for consumption by humans) and/or social importance (animals kept as pets or in zoos) to humans. In particular embodiments, the subject is a human. In other embodiments, the subject is not human.

As used herein, the term “pathogen” refers to any entity that causes disease, including, for example, but not limited to, mycoplasma, fungi, bacteria, viruses, viroids, virus-like organisms, protozoa, nematodes, toxins, and prions. In some embodiments, the pathogen is a Clostridium. In some embodiments, the pathogen is Clostridium botulinum.

As used herein, the terms “chimera” and “chimeric” refer to biological molecules comprising materials derived from two or more organisms of the same or different species. For example, the terms “chimeric antibody” and “chimeric IgG” refer to antibodies comprising amino acid sequences derived from two or more organisms of the same or different species. In some embodiments, the organisms are both of the same species. In some embodiments, the organisms are both human. In some embodiments, the organisms are from different species. In some embodiments, the terms refer to nucleic acid sequences encoding chimeric polypeptide sequences.

The present disclosure provides methods and compositions for high-throughput production of chimeric antibodies that specifically bind to an antigen of interest. The methods combine three procedures into one streamlined process: 1) isolation of lymphocytes producing antibodies of interest from the blood of immunized individuals, 2) amplification of sequences encoding variable domains of light and heavy chains of immunoglobulin from individual isolated cells, and 3) assembly of amplified sequences into specially designed vectors and construction of cells encoding human/human chimeras targeted at antigens of interest. The uniqueness of this process is its ability to generate multiple (up to 100) immunoglobulin-producing clones within a very short time (one to two months). Each such clone encodes an IgG whose variable domains of light and heavy chains originate from the same lymphocyte.

Since the required antibody-producing blood cells could come from a patient recovered from the infection, this system does not depend on the availability of a developed vaccine. Consequently, this system could be used to develop protective entities against rare and even new natural and engineered pathogens at very early signs of appearance. Additionally, the system does not involve use of viruses and, consequently, is safe to use.

The methods allow for rapid generation of IgGs whose heavy chains carry additional polypeptides at the C-termini. This grants the opportunity to produce derivatives of antibodies that can be used to monitor corresponding antigens (IgGs fused with reporter molecules) or to immobilize those pathogens (IgGs fused with polypeptides like Cellulose Binding Domain). Among other fusions, the system allows creation of fusions with Metridia longa luciferase, which allows fast and inexpensive examination of conditions to identify those for optimal production of antibodies. Also, the methods allow for the use of fluorescence activated cell sorting (FACS) for fast selection of clones producing increased levels of IgGs.

The present disclosure provides methods and compositions for robust development of human antibodies targeted at specific antigens of interest. The chosen approach required the ability to 1) isolate individual human lymphocytes specific to the chosen antigen, 2) isolate immunoglobulin-encoding sequences from a single selected cell, and 3) assemble immunoglobulin-encoding constructs that can be introduced into chosen cell cultures for production of corresponding antibodies. Prior to this work, it was unknown whether the dynamics of antibody secretion and the limited number of antigen-specific lymphocytes in the peripheral blood would permit efficient separation of these specific cells from all others. It was unclear whether protocols for RT-PCR at the single cell level would be robust enough to allow their application in a high throughput format. Finally, previously described procedures for assembling expression vectors carrying IgG-encoding sequences were suitable for manipulation of just a very small number of IgG-encoding sequences at a time. By contrast, suitable methods for high throughput production must be capable of simultaneous handling of tens and even hundreds of different sequences.

In some embodiments, the compositions comprise expression vectors encoding constant regions of either light or heavy chains of human IgG. In some embodiments, the compositions comprise an expression vector encoding the constant regions of both the IgG heavy and light chains.

In some embodiments, the methods comprise isolating sequences encoding variable domains of light and heavy chains of IgG from single cells and assembling Ig-encoding vectors.

In some embodiments, the methods comprise introducing designed IgG-encoding constructs into mammalian cells and evaluating conditions for efficient IgG production. In some embodiments, the methods comprise producing and characterizing chimeric IgGs. In some embodiments, the chimeric IgGs are specific for botulinum neurotoxin serotype A (BoNT/A).

Embodiments described herein are set forth in the following non-limiting examples.

EXAMPLES

Example 1. Development of Expression Vectors

This Example demonstrates the construction of expression vectors for the cloning and production of chimeric IgG antibodies that specifically bind an antigen of interest.

In order to create a system for generation of human antibodies that is capable of working in a high throughput format, vectors were necessary that would allow 1) a 100%-certain assembly of sequences encoding light and heavy chains of immunoglobulins, 2) simple assembly of such sequences into one plasmid, and 3) robust selection of cells carrying such plasmids and expressing both chains of immunoglobulins. Plasmids pVLentry-Hyg10 and pVHentry-Cm5, designed for assembly of expression-competent sequences for light and heavy chains of IgG, respectively, meet all of these requirements (FIG. 1). Specifically, both of these plasmids possess two recognition sites for restriction endonuclease Esp3I per plasmid, and these sites flank the sequence encoding protein 10b of bacteriophage T7. These two features ensure that practically 100% of colonies growing after cloning experiments utilizing vectors pVLentry-Hyg10 and pVHentry-Cm5 carry inserts of interest in a pre-determined orientation.

Restriction endonuclease Esp3I cuts DNA outside of its recognition sequence and generates four-nucleotide-long cohesive 5′-overhanging ends. As depicted in FIG. 1, each Esp3I cleavage site in plasmids pVLentry-Hyg10 and pVHentry-Cm5 is unique. Therefore, fragments generated as a result of treatment of these plasmids with Esp3I and removal of the protein 10b-encoding sequence are not able to form a viable circular DNA unless the reaction is supplemented with a DNA fragment carrying appropriate sticky ends. As demonstrated in FIG. 2, the insertion of such a DNA fragment will occur only in one orientation, thus eliminating the need for subsequent analysis of recombinant clones. The sequence encoding protein 10b of bacteriophage T7 functions as a safeguard, preventing re-assembly of the original vector.
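The orientation constraint described above can be illustrated with a minimal Python sketch. It assumes the commonly documented Esp3I cut pattern (recognition sequence CGTCTC, cleavage one nucleotide downstream on the top strand and five nucleotides downstream on the bottom strand). The overhang sequences and function names are hypothetical placeholders and are not the sequences underlined in FIG. 1. Each cohesive end is labeled by the top-strand sequence of its single-stranded region, so two ends can join when their labels are identical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def esp3i_overhang(top_strand: str) -> str:
    """Return the 4-nt overhang (top-strand coordinates) left by the first
    forward-oriented Esp3I site: CGTCTC, one spacer base, then the overhang."""
    site = top_strand.find("CGTCTC")
    if site < 0:
        raise ValueError("no forward-oriented Esp3I site found")
    start = site + len("CGTCTC") + 1
    overhang = top_strand[start:start + 4]
    if len(overhang) < 4:
        raise ValueError("sequence too short downstream of the Esp3I site")
    return overhang


def insert_orientations(vec_left: str, vec_right: str,
                        ins_left: str, ins_right: str) -> list[str]:
    """List the orientations in which an insert can ligate into the opened
    vector. Flipping the insert swaps its ends and reverse-complements them."""
    orientations = []
    if vec_left == ins_left and vec_right == ins_right:
        orientations.append("forward")
    if vec_left == revcomp(ins_right) and vec_right == revcomp(ins_left):
        orientations.append("reverse")
    return orientations


def vector_can_reclose(vec_left: str, vec_right: str) -> bool:
    """An opened vector can re-circularize without an insert only if its
    two cohesive ends carry the same label."""
    return vec_left == vec_right


# Hypothetical, non-palindromic overhangs chosen only for illustration.
VEC_LEFT, VEC_RIGHT = "GGAT", "CTGA"

print(esp3i_overhang("AAACGTCTCAGGATTTT"))
print(insert_orientations(VEC_LEFT, VEC_RIGHT, "GGAT", "CTGA"))
print(vector_can_reclose(VEC_LEFT, VEC_RIGHT))
```

With two distinct overhangs, the sketch reports a single permitted orientation and no insert-free re-closure, which mirrors the behavior attributed to the vectors in this Example.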

In our vectors, expression of the protein 10b-encoding sequence is controlled by the lactose promoter. Expression of this sequence is lethal to F plasmid-containing E. coli (17). Therefore, while our vectors are maintained in F-negative cells, cloning experiments require strains carrying F factor and, after transformation, cells are grown in the presence of IPTG and the corresponding antibiotic (ampicillin in the case of plasmid pVLentry-Hyg10 and chloramphenicol in the case of plasmid pVHentry-Cm5). Under these conditions, only cells carrying plasmids in which the protein 10b-encoding fragment has been substituted with a new insert survive.
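The positive selection described above reduces to a simple survival rule, sketched below in Python. This is a simplified model under stated assumptions: protein 10b is treated as expressed only when IPTG is present, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformant:
    """Simplified description of a transformed E. coli cell."""
    host_has_f_factor: bool
    plasmid_has_10b: bool      # True if the 10b cassette was NOT replaced
    plasmid_resistance: str    # e.g. "ampicillin" or "chloramphenicol"


def survives(cell: Transformant, antibiotic: str, iptg_present: bool) -> bool:
    """Return True if the cell is expected to form a colony."""
    # The plasmid must confer resistance to the antibiotic in the medium.
    if cell.plasmid_resistance != antibiotic:
        return False
    # Induced protein 10b is lethal in F factor-carrying hosts.
    if cell.plasmid_has_10b and cell.host_has_f_factor and iptg_present:
        return False
    return True


# Cloning host (F+) plated on ampicillin plus IPTG.
parental = Transformant(True, True, "ampicillin")
recombinant = Transformant(True, False, "ampicillin")
print(survives(parental, "ampicillin", iptg_present=True))
print(survives(recombinant, "ampicillin", iptg_present=True))
```

Under these assumptions, a cell carrying the unmodified vector is eliminated in the F factor-carrying cloning host, while the same vector can still be maintained in an F-negative host.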

Another important element of our vectors is a strong promoter that can direct transcription of the inserted sequence in mammalian cells. In vectors pVLentry-Hyg10 and pVHentry-Cm5, this role is served by the sequence from cytomegalovirus (CMV). However, we also designed plasmids in which a sequence from Rous sarcoma virus is used for this purpose. Plasmids pVLentry-Hyg10 and pVHentry-Cm5 are designed in such a way that transcripts initiated from the CMV promoter incorporate not only a sequence lying immediately downstream of the promoter, but also an Internal Ribosome Entry Site (IRES) and a sequence for antibiotic resistance. In the case of plasmid pVLentry-Hyg10, this is resistance to Hygromycin B and, in the case of plasmid pVHentry-Cm5, this sequence confers resistance to G418. The presence of the IRES makes synthesis of the antibiotic-inactivating protein proportional to synthesis of the protein encoded by the preceding portion of the transcript (an immunoglobulin chain in the derivatives of these plasmids). This feature is not absolutely necessary for selection of stable transfectants (in some of our plasmids it is not present); however, it makes further maintenance of selected clones easier and opens opportunities for their further improvement.

In addition, the design of our vectors allows simple combination of sequences encoding the light and heavy chains of IgG in the same plasmid, which, in turn, ensures that equal amounts of the IgG chain-encoding sequences are introduced into the cell during transfection. I-SceI recognition sites are one of the elements that enable such combination.

I-SceI is a site-specific homing endonuclease that recognizes an 18-nucleotide-long sequence and generates DNAs with cohesive ends that can be used for cloning. Due to the length of the target sequence, its occurrence in a sequence encoding a variable domain of Ig is extremely unlikely. Therefore, using this enzyme enables transfer of entire IgG-encoding sequences from one plasmid into another without destroying the integrity of these sequences. The nonsymmetrical cohesive ends generated by I-SceI ensure that, in all generated plasmids, the relative orientation of the IgG-encoding sequences is the same. This feature further improves the reproducibility of IgG production experiments. As shown in FIG. 1, plasmids pVLentry-Hyg10 and pVHentry-Cm5 possess two I-SceI sites each. However, in plasmid pVLentry-Hyg10, the I-SceI sites flank the Ig-encoding cassette, while in plasmid pVHentry-Cm5, both I-SceI sites are located on one side of the Ig-encoding cassette and flank the gene of the alpha peptide of beta-galactosidase (lacZ′).
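
The effect of site length can be illustrated with a rough back-of-the-envelope estimate. The sketch below assumes a random sequence with equal base frequencies and a variable-domain coding region of about 400 nucleotides. Both are simplifying assumptions made for illustration, not figures from this disclosure, and the function name is hypothetical.

```python
def expected_site_hits(region_len, site_len=18, strands=2):
    """Expected number of chance occurrences of one specific site.

    Assumes independent, equiprobable bases; both strands are counted
    because the site is not palindromic.
    """
    windows = max(region_len - site_len + 1, 0)
    return strands * windows / 4 ** site_len


if __name__ == "__main__":
    # 18-nt site (I-SceI-like) versus a 6-nt site in an assumed 400-nt region.
    print("18-nt site: %.2e" % expected_site_hits(400, site_len=18))
    print(" 6-nt site: %.2f" % expected_site_hits(400, site_len=6))
```

Under these assumptions an 18-nucleotide site is expected roughly once per hundred million such regions, whereas a six-nucleotide site would be expected in a substantial fraction of them.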

In addition to the differences in the location of their I-SceI sites, the two plasmids carry different antibiotic-resistance markers. Both plasmids use the same origin of replication for propagation in E. coli cells and therefore are not able to coexist in the same cell. Together, these features speed up the assembly and identification of the plasmid carrying both the L- and H-chain encoding sequences. Indeed, a simple treatment of the mixture of L- and H-chain encoding plasmids with I-SceI and ligase generates the required hybrid plasmid. Like one of its parents, this plasmid carries the chloramphenicol-resistance gene, but, unlike this parent, it is not able to produce the alpha-peptide of beta-galactosidase. As a result, only cells carrying the required plasmid, and not the three others present in the mixture, are able to form white colonies on media supplemented with chloramphenicol, X-Gal and isopropyl-β-D-thiogalactopyranoside (IPTG).

Also disclosed are four derivatives of plasmid pVHentry-Cm5. These derivatives have all of the elements described above. However, instead of the sequence encoding the constant part of the IgG heavy chain alone, these plasmids contain sequences that encode fusions of the same part of the IgG heavy chain with different polypeptides: the first encodes a fusion with green fluorescent protein (GFP); the second, a fusion with luciferase from Metridia longa (MLuc) (18, 19); the third, a fusion with a His-tag and a peptide that can be biotinylated by biotin ligase; and the fourth, a fusion with a polypeptide that specifically binds cellulose (20, 21).

Example 2 Isolation of Sequences Encoding Variable Domains of Light and Heavy Chains of IgG

A single individual who had been vaccinated with pentavalent botulinum toxoid vaccine six years earlier and had received several boosts served as the donor of blood cells. These cells were subjected to fractionation on a Ficoll gradient, enrichment on BD IMag™ Anti-human CD19 Particles-DM and, finally, cell sorting. As a marker for cells producing anti-BoNT/A, we used a fusion between Green Fluorescent Protein and the receptor-recognizing domain of BoNT/A (gfpBoNT/A-CH5). This protein was constructed in our lab and, prior to use in cell sorting experiments, was tested for the ability to recognize specific receptors present in neuroblastoma cells (FIG. 3).

Cells simultaneously binding APC-Mouse-anti-human CD19 and gfpBoNT/A-CH5 were sorted into wells of a 96-well plate, one cell per well.

Isolated cells were used as a source of sequences encoding V_(H)- and V_(L)-regions. We have developed an RT-PCR procedure for these sequences that includes three steps: 1) reverse transcription of mRNA released from the cell by perfringolysin O, 2) simultaneous amplification of cDNAs encoding V_(H)- and V_(L)-regions in the same tube by PCR and 3) re-amplification of the sequences encoding each region in its own tube. Each step has its own set of primers. The whole procedure takes less than 8 hours. The number of cells that can be processed during this time is limited mostly by the capacity of the available thermocycler. Primers were designed based on the analysis of available human Ig-encoding sequences known in the art (8, 22). Primers used during each step are summarized in Table 1. Primers used in the re-amplification step were designed to introduce, into the ends of amplified fragments, unique sequences that can be converted into four-nucleotide-long cohesive ends compatible with the ends generated by Esp3I restriction endonuclease in the corresponding vectors (see previous section). The conversion occurs when purified DNA fragments are treated with T4 DNA polymerase in the presence of dCTP, as demonstrated in FIG. 2. Because no restriction endonucleases are used at this stage, none of the sequences is lost due to the presence of internal sites for such endonucleases.

TABLE 1 Primers used for amplification of sequences encoding variable domains of human immunoglobulins.

Primer        Sequence

Primers used for reverse transcription
IgG-CHH       GGGGAAGAGGAAGACTGACGGTC
Cm1           CAGTACTGCGATGAGTGGCA
Clv-3         TGTGGCCTTGTTGGCTTG
Oligo dT

Primers used at the PCR amplification stage
pVk-1         GAGTCAGDYYCDRYCAGGACACAGCATG
pVk-2         AGACCCTGTCAGGACACAGCATAGACATG
pVk-3         GGACTCCTCAGTTCACCTTCTCACAATG
pVk-4         TGCTCAGTTAGGACCCAGAGGAACCATG
hIgGk-3       TAATGGCCTAACACTCTCCCCTGTTGAAGCTCTT
IgGH-1        TGAGVDMMGYWCHTCACCATGGACTG
IgGH-2        ACTGAACACAGAGGACTCACCATGGA
IgGH-3        CAGTGACTCCTGTGCCCCACCATGGACA
IgGH-4        TTTCTGTCCTCCACCATCATGGGGTC
IgGH-5        GCACTGAACACAGACCACCAATCATGG
IgG-CHH       GGGGAAGAGGAAGACTGACGGTC
M1            CCTGGGAGCACAGCTCATCACCATGGA
M2            CACTGAACACAGAGGACTCACCATGGA
M3            CATGGACCTCCTGCACAAGAACATGAA
M4            ACTGAACAGAGAGAACTCACCATGGA
Cm1           CAGTACTGCGATGAGTGGCA
Vl1-5T7       TTTAGGCCATGGCCTGGACCCCTCTCCTGCTC
Vl2-5T7       TTTAGGCCATGGCCTGGACCKTTCTCCTCCTC
Vl3-5T7       TTTAGGCCATGGCCTGGDCTCYKCTCCTYCTC
Vl4-5T7       TTTAGGCCATGGCATGGCCAGCTTCCCTCTCCTCCTC
Vl5-5T7       TTTAGGCCATGACCTGCTCCCCTCTCCTCCTC
C1-3          CCTGCAGCTCTAGTCTCCCGTGG

Primers used at the re-amplification stage
Vk-1/2-5T7    TTTAGGCATGGACATGAGGGTCCCCGCTCAGCTCCTGG
Vk-3-5T7      TTTAGGCATGGAAACCCCAGCGCAGCTTCT
Vk-4-5T7      TTTAGGCATGGTGTTGCAGACCCAGGTCTT
hIgGk-3       TAATGGCCTAACACTCTCCCCTGTTGAAGCTCTT
IgG-CH        TATTGGCGAGCTGGCCTCTCACCAACTGTCTTGTCCACCTTGGTGTTG
Vh-1-3T7      CACTGGAGACGGTGACCAGBGTBCCYTGKCCCCA
Vh-1-3T75     TATTGGCactcacggaagagacggtgaccagBgtBccYtg
Vh-1-5T7      TATAGccatggactggacctgga
Vh-2-5T7      TATAGccatggacatactttgttccac
Vh-3-5T7      TATAGccatggagtttgggctgagc
Vh-4-5T7      TATAGccatgaaacacctgtggttctt
Vh-5-5T7      TATAGccatggggtcaaccgccatcct
Vh-6-5T7      TATAGccatgtctgtctccttcctcat
Vh-7-5T7      TATAGccatggaatttgggettagct
Vh-8-5T7      TATAGccatggaattggggctgag
Vh-1-3T75     TATTGGCactcacggaagagacggtgaccagBgtBccYtg
Vm-1-5T7      TATAGaccatggactggacctggaggttcct
Vm-2-5T7      TATAGaccatggagtttgggctgagctgggt
Vm-3-5T7      TATAGaacatgaaacacctgtggttcttcct
Vh-1-3T75     TATTGGCactcacggaagagacggtgaccagBgtBccYtg
Vl1-5T7       TTTAGGccatggcctggacccctctcctgctc
Vl2-5T7       TTTAGGccatggcctggacckttctcctcctc
Vl3-5T7       TTTAGGccatggcctggdctcykctcctyctc
Vl4-5T7       TTTAGGccatggcatggccagcttccctctcctcctc
Vl5-5T7       TTTAGGccatgacctgctcccctctcctcctc
hIgG1-3       taatggcCTATGAACATTCTGTAGGGGCCAC
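
The conversion of primer-encoded ends into cohesive ends by T4 DNA polymerase and dCTP can be illustrated with a simplified model. The model assumes that the 3′→5′ exonuclease activity trims the complementary strand until it reaches the first position where dCTP can be re-incorporated, that is, opposite the first G of the primer strand. The function names are hypothetical, and the four primers are copied from the re-amplification section of Table 1.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def chewback_overhang(primer, dntp="C"):
    """Predict the 5' overhang left on the primer strand (simplified model).

    The complementary strand is trimmed from its 3' end until the supplied
    nucleotide can be put back, i.e. opposite the first primer base that
    pairs with `dntp`.
    """
    seq = primer.upper()
    stop_base = COMPLEMENT[dntp.upper()]
    stop = seq.find(stop_base)
    if stop == -1:
        raise ValueError("primer contains no %s; trimming would not stop" % stop_base)
    return seq[:stop]


def anneals(overhang_a, overhang_b):
    """True if two 5' overhangs (both written 5'->3') are fully complementary."""
    rc = "".join(COMPLEMENT[base] for base in reversed(overhang_b.upper()))
    return overhang_a.upper() == rc


# Primer sequences copied from Table 1 (re-amplification stage).
PRIMERS = {
    "Vl1-5T7": "TTTAGGccatggcctggacccctctcctgctc",
    "Vh-1-5T7": "TATAGccatggactggacctgga",
    "Vh-1-3T75": "TATTGGCactcacggaagagacggtgaccagBgtBccYtg",
    "hIgG1-3": "taatggcCTATGAACATTCTGTAGGGGCCAC",
}

if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(name, chewback_overhang(sequence))
```

Under this model, each of the listed primers yields a four-nucleotide overhang (TTTA, TATA, TATT and TAAT, respectively), and the two ends of any one fragment differ from each other, which is consistent with insertion in a single orientation. An insert end of TTTA would pair with a vector end of TAAA, as `anneals("TTTA", "TAAA")` confirms.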

In the end, only 24% of the originally sorted cells produced sequences for both V_(H)- and V_(L)-regions. This may seem like a relatively low success rate. However, given the potential for collecting hundreds of cells and the ability to process them in just a few days, this rate allows the accumulation of tens of pairs of sequences for further antibody assembly. In the future, we expect to increase this rate by including anti-CD27 or anti-B220 monoclonal antibodies in the cell sorting protocol, thereby increasing the proportion of selected cells that produce antibodies rather than merely absorb them.
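
The arithmetic behind this estimate is simple enough to sketch. The 24% recovery rate is the figure reported above; the plate size and target numbers in the example, as well as the function names, are illustrative assumptions.

```python
import math

PAIR_RECOVERY_RATE = 0.24  # fraction of sorted cells yielding both V_H and V_L


def expected_pairs(cells_sorted, rate=PAIR_RECOVERY_RATE):
    """Expected number of V_H/V_L pairs recovered from a number of sorted cells."""
    return cells_sorted * rate


def cells_needed(target_pairs, rate=PAIR_RECOVERY_RATE):
    """Smallest number of sorted cells expected to yield at least target_pairs."""
    return math.ceil(target_pairs / rate)


if __name__ == "__main__":
    print("Pairs expected from one 96-well plate: %.1f" % expected_pairs(96))
    print("Cells needed for 20 pairs:", cells_needed(20))
```

At this rate, a single 96-well plate is expected to yield about 23 pairs, so a few plates are sufficient to reach the tens of pairs mentioned above.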

Sequencing of 11 pairs of isolated DNA fragments revealed that practically all pairs were unique. Even when two pairs had one identical chain, the second chains were different (sequences of variable domains of light and heavy chains are listed in Appendix 2 and 3).

Example 3 Introduction of Designed IgG-Encoding Constructs into Mammalian Cells and Evaluation of Conditions for Efficient IgG Production

Eight pairs of isolated sequences were incorporated into the previously described vectors and the resulting plasmids were introduced into CHO and HEK cells. ELISA registered accumulation of human antibodies in the media of both of these cultures. In isolated stable cell lines, the level of production varied but did not exceed 1-2 μg/ml (the level of production was determined on the basis of the amount of anti-BoNT/A purified from 100 ml of culture media, as described below). In our experience, HEK cells proved to be more robust and capable of producing more antibodies from the same volume of media. These cells were also easier to adapt to growth and IgG production in serum-free media. For these reasons, we preferred to use HEK cells in most of our later analyses.

To select clones with higher production, we decided to use the correlation, built into our system and discussed earlier, between translation of the sequences encoding light and heavy chains of IgGs and translation of those encoding antibiotic-inactivating proteins. Specifically, by gradually increasing the amounts of antibiotics in the culture media, we were able to select cell lines whose resistance to antibiotics was 3-4 times higher than that of the originally selected cultures. As demonstrated in FIG. 4, however, ELISA revealed that cells with increased resistance to antibiotics did not produce substantially more immunoglobulins than cells possessing a lower level of resistance to these antibiotics.

These data suggest that the bottleneck of production lies somewhere at the post-translational level. The conventional way of identifying cells with increased production of IgGs is limiting dilution cloning. The low-throughput nature of this method significantly limits the number of clones that can feasibly be screened. We tested whether fluorescence-activated cell sorting (FACS) can be used to increase throughput. As a marker for IgG-producing cells, we used the previously mentioned gfpBoNT/A-CH5. Cells were released from the solid support via treatment with trypsin and washed two times with fresh RPMI media to remove the trypsin. Then, cells were incubated in RPMI media for 1 hour, co-incubated with gfpBoNT/A-CH5 for 10 minutes and subjected to FACS. Out of the 1% of cells with the highest fluorescence intensity, corresponding to the highest antibody production rates, single cells were sorted directly into 96-well plates at one cell per well. One plate was assembled for each IgG-producing cell line. Table 2 demonstrates that we were able to find clones with increased production of IgG-luciferase hybrids for five of the seven cell lines used in the experiment. These results demonstrate the potential of FACS for further development of cell lines producing high quantities of IgGs.

TABLE 2 Production of IgG-MLuc by original cultures and individual clones selected from these cultures

Original culture   Luminescence   Clone   Luminescence
HEK-1HL-MLuc       657,148        1E7     1,641,522
HEK-7HL-MLuc       1,387,980      7B8     8,013,339
HEK-8HL-MLuc       981,702        8E8     3,783,486
HEK-9HL-MLuc       1,991,512      9F6     2,778,794
HEK-14HL-MLuc      951,132        14G11   721,576
HEK-15HL-MLuc      104,466        15F2    594,677
HEK-41HL-MLuc      3,274,119      41C9    3,163,750
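
The comparison summarized in Table 2 can be reproduced with a short calculation of the clone-to-parent luminescence ratio. The values below are transcribed from Table 2, with the entry printed as "2.778.794" read as 2,778,794; the variable and function names are illustrative.

```python
# (original culture, parent luminescence, clone, clone luminescence) from Table 2
TABLE_2 = [
    ("HEK-1HL-MLuc", 657_148, "1E7", 1_641_522),
    ("HEK-7HL-MLuc", 1_387_980, "7B8", 8_013_339),
    ("HEK-8HL-MLuc", 981_702, "8E8", 3_783_486),
    ("HEK-9HL-MLuc", 1_991_512, "9F6", 2_778_794),
    ("HEK-14HL-MLuc", 951_132, "14G11", 721_576),
    ("HEK-15HL-MLuc", 104_466, "15F2", 594_677),
    ("HEK-41HL-MLuc", 3_274_119, "41C9", 3_163_750),
]


def fold_changes(rows):
    """Map each clone to its luminescence relative to the parent culture."""
    return {clone: clone_lum / parent_lum for _, parent_lum, clone, clone_lum in rows}


def improved_lines(rows):
    """Cell lines whose selected clone out-produces the original culture."""
    return [line for line, parent_lum, _, clone_lum in rows if clone_lum > parent_lum]


if __name__ == "__main__":
    for clone, ratio in fold_changes(TABLE_2).items():
        print("%-6s %.2f-fold" % (clone, ratio))
    print("Improved lines:", len(improved_lines(TABLE_2)), "of", len(TABLE_2))
```

The calculation gives an increase for five of the seven lines, with clones 7B8 and 15F2 each showing more than a five-fold increase over the parent culture, while clones 14G11 and 41C9 fall slightly below their parents.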

Production of the chimeric IgGs and their characterization

For the reasons mentioned in the previous section, most of the IgG constructs were purified from the culture media of HEK cells. Our analysis of the accumulation of luciferase activity in the culture media of two cell lines encoding IgG-MLuc fusions revealed that accumulation in both continued for seven days. Therefore, all HEK cultures were grown for seven days in the same media, which was then passed through a column containing the hybrid between staphylococcal protein A and streptococcal protein G. In the case of CHO cells, the media was collected after three days. Bound IgGs were eluted by a buffer change to 0.1 M glycine HCl (pH 2.3). Immediately after elution, the pH of the collected fractions was increased by addition of 1 M Tris-Base. Then, the fractions were subjected to buffer exchange and concentrated by ultrafiltration.

In addition to IgGs alone, we purified fusions of these IgGs with luciferase, with GFP, and with a His-tag connected to the peptide that serves as a target for biotin ligase (BirA). Analysis confirmed that the isolated fractions contained polypeptides of the expected molecular weights that were recognized by anti-human antibodies (FIG. 5).

Fractions with IgG-MLuc fusions produced light in the presence of the luciferase substrate coelenterazine. The IgG-GFP fusion emitted the green light characteristic of GFP upon illumination with UV light. Finally, the IgG fusion with His-tag and BirA substrate bound to a Ni-column and, after treatment with BirA in the presence of biotin and ATP, was recognized by a streptavidin-alkaline phosphatase conjugate (data not presented).

ELISA revealed that all eight of the different IgGs that we purified recognize the receptor-recognizing domain of BoNT/A (FIG. 6). These data suggest that practically all isolated cells from which we were able to recover IgG-encoding sequences produced BoNT/A-specific antibodies.

IgGs were recognized by hybrid proteins composed of staphylococcal protein A, streptococcal protein G and Metridia longa luciferase (spAG-MLuc and spAG-ΔN-MLuc), which were developed in our lab (sequences of plasmids encoding these proteins are presented in Appendix 4). These hybrids allowed quantitative monitoring of the IgG present in wells of a 96-well plate. Hybrid spAG-MLuc possessed luciferase activity only when it was purified from the culture media of mammalian cells. Hybrid spAG-ΔN-MLuc possessed luciferase activity irrespective of where it was expressed, in E. coli or in mammalian cells.

Examples 1-3 demonstrate that 1) the number of peripheral blood cells encoding specific IgGs in blood and the efficiency of the cell sorting protocols used are sufficient to produce hundreds of cells that can serve as a source of Ig-encoding sequences; 2) the methods disclosed herein permit reliable isolation of cDNA encoding variable domains of both Ig chains from one fifth of all isolated individual lymphocytes; 3) practically all isolated cDNA pairs encode IgG specific to the antigen used in the cell sorting procedure; 4) the expression vectors described herein are suitable for high-throughput assembly of plasmids encoding both full-size human IgGs and their derivatives carrying polypeptides that allow monitoring and/or specific binding of these IgGs to other molecules; 5) the vectors allow efficient selection of cells producing both IgG chains; and 6) FACS can be used as an efficient tool for selecting clones that produce increased quantities of IgGs and their derivatives.

Accordingly, the compositions and methods described herein are useful in methods comprising one or more of these aspects.

Example 4 Construction and Expression of Libraries of Anti-Botulinum Chimeras that Recognize Regions of BoNT/A

This example demonstrates the construction and use of libraries of anti-botulinum chimeras that recognize regions of BoNT/A.

First, we will use conventional methods of gene engineering to create fusions of the catalytic and transport domains of BoNT/A with GFP. Like the previously mentioned gfpBoNT/A-CH5, these fusions will be used as markers for lymphocytes producing antibodies specific for these domains. As a source of lymphocytes, we will use white blood cells from the blood of an immunized individual that were generated and tested previously, and preserved under liquid nitrogen. It has been demonstrated that such cells can be used as a source of immunoglobulin-encoding sequences (25). These cells will be subjected to enrichment on BD IMag™ Anti-human CD19 Particles-DM and then sorted into wells of a 96-well plate, one cell per well. Prior to FACS, cells will be labeled with APC Mouse Anti-Human CD19 (BD Biosciences) and the corresponding GFP-BoNT/A fusion. To better discriminate cells that produce IgG from those that do not produce it but instead absorb it from serum, we will include an additional marker, a memory B cell marker. Bleesing and Fleisher reported that human B cells expose either B220 or CD27 on their surface [30]. Therefore, as the third component of the cell labeling mixture, we will use anti-CD27 (Ancell Co.) and/or anti-B220 (Beckman Coulter) monoclonal antibodies, each conjugated to R-Phycoerythrin.

Isolated cells will be used as a source of sequences encoding V_(H)- and V_(L)-regions. Isolation and further handling of these sequences will be done according to the protocols described above. At this stage, the goal will be to isolate 10-20 V_(H)- and V_(L)-encoding pairs with unique sequences for each BoNT/A domain.

Unique V_(H)- and V_(L)-encoding pairs will be used to assemble and produce human/human IgG chimeras as described above.

Example 5 Identification of IgGs and Their Combinations that can Neutralize Toxic Activity of BoNT/A

This Example demonstrates the identification of chimeric IgG antibodies with the capacity to neutralize toxicity of BoNT/A using phage display.

Choosing V_(H)- and V_(L)-encoding pairs with unique sequences does not guarantee that they will recognize different epitopes. Therefore, prior to conducting expensive toxin-neutralizing experiments, we will sort the developed IgGs according to their epitope specificities. For this, we will use phage display, a technique known in the art. This technology involves a library of random peptides. Sequences of these peptides are incorporated in the region of the phage genome that encodes the capsid protein. As a result, each phage particle in the library encodes and exposes on its surface only one type of peptide. We previously demonstrated that incubation of such a library with immobilized polyclonal antibodies raised against BoNT/A allows isolation of phage particles that encode peptides mimicking BoNT/A epitopes (mimetics).

We will use a similar approach to sort the developed IgGs according to their epitope specificities. Specifically, each developed IgG will be purified and immobilized on a solid support. Then, each immobilized IgG will be co-incubated with the phage display library MD-12™ (Alpha Universe, LLC). Phages that do not bind to the IgG will be removed by washing, and those bound to the IgG will be released and grown on appropriate host cells. Following this amplification, phages will be subjected to two additional cycles of the above-described screening procedure. In our previous experience, practically all phages released after the third cycle possess affinity to the IgG used in selection. To ensure that selected phages carry mimetics of BoNT/A, we have to prevent isolation of phages that interact with parts of the IgG other than the antigen-binding region. To do this, phages will be subjected to depletion with human naïve serum every time prior to incubation with the immobilized developed IgG. After mixing with phages, components of the human naïve serum, as well as phage particles bound to them, will be removed by addition of magnetic beads carrying immobilized staphylococcal protein A-streptococcal protein G hybrid to the mixture.

Individual phages carrying BoNT/A mimetics will be used for characterization of the developed IgGs. Specifically, each IgG will be immobilized in wells of a 96-well plate and each immobilized IgG will be incubated with all chosen mimetic-exposing phages. Wells with bound phages will be identified using M13 phage-specific antibodies conjugated with horseradish peroxidase (GE Healthcare) and 1-Step™ Slow TMB-ELISA (PIERCE). IgGs interacting with the same phage will be considered as recognizing the same epitope.
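
The grouping rule stated above (IgGs that bind the same phage are assigned to the same epitope group) can be sketched as a small union-find routine. The IgG and phage identifiers below are hypothetical placeholders, not experimental results, and merging groups transitively when two IgGs share any phage is one possible reading of the rule.

```python
from collections import defaultdict


def group_by_shared_phage(binding):
    """Group IgGs that bind at least one common phage.

    binding maps an IgG name to the set of phage clones it bound in ELISA.
    Groups are merged transitively: if A and B share a phage and B and C
    share another, A, B and C end up in one group.
    """
    parent = {igg: igg for igg in binding}

    def find(node):
        # Follow parent links to the group representative, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    first_binder = {}
    for igg, phages in binding.items():
        for phage in phages:
            if phage in first_binder:
                parent[find(igg)] = find(first_binder[phage])
            else:
                first_binder[phage] = igg

    groups = defaultdict(list)
    for igg in binding:
        groups[find(igg)].append(igg)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    # Hypothetical ELISA outcome: which phage clones each IgG bound.
    example = {
        "IgG-1": {"phage-A"},
        "IgG-2": {"phage-A", "phage-B"},
        "IgG-3": {"phage-C"},
        "IgG-4": {"phage-B"},
        "IgG-5": set(),
    }
    for members in group_by_shared_phage(example):
        print(members)
```

An IgG that binds none of the chosen phages, such as the hypothetical IgG-5, remains in a group of its own, so one representative per group can then be carried forward to neutralization testing.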

In addition to classifying the developed IgGs according to their epitope (strictly speaking, mimetic) specificity, we will characterize these IgGs according to the nature of the recognized epitopes (linear or structural). In these experiments, we will compare the interaction of the developed IgGs with the corresponding recombinant domains that have or have not been subjected to denaturing treatment. For this, the corresponding BoNT/A fragments will be subjected to native or SDS polyacrylamide gel electrophoresis, transferred onto a nitrocellulose membrane and probed with each chosen IgG separately. Then, filters will be treated with biotinylated anti-human IgGs, followed by treatment with streptavidin-horseradish peroxidase conjugate and Metal Enhanced DAB Substrate Kit (Pierce, Inc.). IgGs recognizing both forms of a BoNT/A fragment will be considered as recognizing linear epitopes. Those that recognize only BoNT/A fragments not subjected to denaturing conditions will be considered as recognizing structural epitopes.

The information about the nature of the recognized epitope will be used not only to verify the epitope-based grouping of IgGs, but also to gain information about the locations of the corresponding epitopes on the BoNT/A molecule. Specifically, our previous experience suggests that, in the case of mimetics of linear epitopes, some similarities between the sequences of these mimetics and the BoNT/A sequence can be observed. Such similarities may be used as indicators of the location of the corresponding epitope in the structure of the molecule.

After the developed IgGs are classified and grouped, representatives from each group will be tested for the ability to neutralize BoNT/A.

It has been demonstrated that even when individual monoclonal antibodies do not have substantial protective activity, their combination may have such activity (24). This is why the analysis will include testing of the BoNT/A-neutralization potential of each chosen IgG separately and, then, testing of such potential for selected groups of IgGs.

The goal of this analysis will be to identify IgGs or their combinations that will be able to protect mice from at least 1000 mouse median lethal doses (MLD₅₀, the dose that is lethal to fifty percent of mice) of BoNT/A. In addition, the aim will be to determine which of the three regions of the BoNT/A molecule (catalytic, transport, or receptor-recognizing) contains the highest number of protective epitopes. This information will be instrumental for the development of antibodies capable of neutralizing other serotypes of BoNTs.

Example 6 Development of Human/Human IgG Chimeras Capable of Neutralizing BoNT/B

This Example demonstrates the development of human/human IgG chimeras capable of neutralizing BoNT/B.

Previously, we demonstrated that different serotypes of BoNTs have similar epitopes and that information about the locations of epitopes in one serotype can be used to predict the locations of epitopes in other serotypes (26). We will use this phenomenon to speed up the development of IgGs capable of neutralizing BoNT serotype B. Specifically, instead of developing IgGs to the whole molecule of BoNT/B, we will focus on just one region. This region will be the same one that is revealed in BoNT/A as possessing the most potent protective epitopes. Once the targeted region of BoNT/B is determined, we will create a fusion between GFP and the corresponding fragment of BoNT/B. This fusion will be used to isolate the corresponding lymphocytes from the same cryopreserved fractions of blood cells mentioned earlier. FACS and the subsequent isolation of cDNAs, their PCR, cloning, expression of assembled sequences, purification of IgGs, and analysis of their protective properties will be done in the same way as described in the previous two sections.

As in the case of BoNT/A, our goal will be to identify IgGs or their combinations that will ensure protection of mice from at least 1000 MLD₅₀.

Optimization of protocols for production of chosen chimeras

The ability to efficiently produce the developed protective IgGs is a key element for the system to become commercially viable. An earlier analysis of different monoclonal antibody-producing cell lines conducted by O'Callaghan and coauthors revealed that each cell line had its own bottleneck limiting production of antibodies (27). This research supports the approach of selecting high producers from a population of cells already producing IgG. This approach has been used successfully by many groups, including ours. However, such selection often requires multiple cycles and is very lengthy. Development of a strain in which the bottlenecks are widened or even removed will substantially increase the potential for high-throughput development of cells producing high quantities of IgGs. Recent reports of successful increases in antibody production via introduction of specific DNA sequences into the cells suggest that such an approach is possible (28-30).

To create a cell line inherently capable of producing increased quantities of IgGs, we will produce IgG derivatives carrying different polypeptides on the C-termini of their heavy chains. Specifically, we will engineer a plasmid encoding one of the anti-BoNT/A IgGs fused with the trans-membrane domain of platelet-derived growth factor receptor (31). This plasmid will allow generation of transiently transfected cells expressing IgG anchored in the cell membrane. Such cells will be stained with gfpBoNT/A-CH5 and subjected to FACS. Individual cells carrying the highest levels of fluorescent label will be sorted into wells of a 96-well plate and allowed to grow. We anticipate that the majority of such cells will lose the IgG-encoding plasmids. As a result, such cells will stop producing the corresponding IgG derivative and the antibiotic-inactivating enzymes encoded by the plasmid. Cell lines grown from such cells will be transfected again. This time, we will use a plasmid encoding an IgG-luciferase hybrid formed by a V_(H)- and V_(L)-pair different from the one used in the previous transfection. Parental cell lines for those transient transfectants whose culture media contains the highest amounts of luciferase will be tested further for the ability to produce high quantities of other types of IgG-luciferase fusions. Eventually, we expect to be able to isolate a cell line that will produce increased quantities of IgGs irrespective of the sequences of their V_(H)- and V_(L)-regions.

To increase the success rate of the above-described selection, we will use a cell line whose diversity will be increased by chemical mutagenesis. Further, to eliminate difficulties associated with sorting originally adherent cells, we will use FREESTYLE™ CHO-S® cells (Invitrogen, Inc.). This cell line has been adapted to grow in suspension in serum-free media. The latter feature will be beneficial for future production of antibodies.

Even with a developed host cell line capable of increased production of IgGs, we do not exclude the need for additional selection of super-producers among the created IgG-producing cells. Traditionally, such selection is done by limiting dilution cloning, which is a very labor-intensive process. We will use FACS protocols to isolate from the population the cells that bind the highest amounts of the label after a very short exposure to it, followed by isolation of the cells that lose this label faster than others.

As a result of these activities, we will not only generate cell lines producing high quantities of the chosen IgGs, but will also determine the best way to efficiently develop new IgG-producing cell lines.

REFERENCES

-   1. Smith, K., Garman, L., Wrammert, J., Zheng, N., Capra, J. D., and Wilson, P. C. (2009) Nat Protoc. 4, 372-384
-   2. Arnon, S. S., Schechter, R., Inglesby, T. V, Henderson, D. A., Bartlett, J. G., Ascher, M. S., Eitzen, E., Fine, A. D., Hauer, J., Layton, M., Lillibridge, S., Osterholm, M. T., O'Toole, T., Parker, G., Perl, T. M., Russell, P. K., Swerdlow, D. L., and Tonat, K. (2001) JAMA 285, 1059-1070 [online] http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=11209178.
-   3. St John, R., Finlay, B., and Blair, C. (2001) The Canadian journal of infectious diseases = Journal canadien des maladies infectieuses 12, 275-84 [online] http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2094836&tool=pmcentrez&rendertype=abstract (Accessed Nov. 23, 2012).
-   4. Smith, L. A., and Rusnak, J. M. (2007) Critical reviews in immunology 27, 303-18 [online] http://www.ncbi.nlm.nih.gov/pubmed/18197811 (Accessed Nov. 21, 2012).
-   5. Notice of CDC's discontinuation of investigational pentavalent (ABCDE) botulinum toxoid vaccine for workers at risk for occupational exposure to botulinum toxins (2011) MMWR Morb Mortal Wkly Rep 60, 1454-1455 [online] http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=22031218.
-   6. Clayton, M. A., Clayton, J. M., Brown, D. R., and Middlebrook, J. L. (1995) Infect Immun 63, 2738-42.
-   7. Black, R. E., and Gunn, R. A. (1980) The American journal of medicine 69, 567-70 [online] http://www.ncbi.nlm.nih.gov/pubmed/7191633 (Accessed Nov. 23, 2012).
-   8. Wang, X., and Stollar, B. D. (2000) 244, 217-225
-   9. Orlandi, R., Gussow, D. H., Jones, P. T., and Winter, G. (1992) Biotechnology 24, 527-31.
-   10. Beidler, C. B., Ludwig, J. R., Cardenas, J., Phelps, J., Papworth, C. G., Melcher, E., Sierzega, M., Myers, L. J., Unger, B. W., and Fisher, M. (1988) J Immunol 141, 4053-60.
-   11. Zhao, Y., and Hammarström, L. (2003) Immunology 108, 288-95 [online] http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1782897&tool=pmcentrez&rendertype=abstract (Accessed Nov. 14, 2012).
-   12. CDC (2011) MMWR. Morbidity and mortality weekly report 60, 1454-5 [online] http://www.ncbi.nlm.nih.gov/pubmed/22031218 (Accessed Aug. 24, 2012).
-   13. Beidler, C. B., Ludwig, J. R., Cardenas, J., Phelps, J., Papworth, C. G., Melcher, E., Sierzega, M., Myers, L. J., Unger, B. W., and Fisher, M. (1988) Journal of immunology (Baltimore, Md.: 1950) 141, 4053-60 [online] http://www.ncbi.nlm.nih.gov/pubmed/3141512 (Accessed Nov. 24, 2012).
-   14. Gillies, S. D., Lo, K. M., and Wesolowski, J. (1989) Journal of immunological methods 125, 191-202 [online] http://www.ncbi.nlm.nih.gov/pubmed/2514231 (Accessed Nov. 24, 2012).
-   15. Norderhaug, L., Olafsen, T., Michaelsen, T. E., and Sandlie, I. (1997) Journal of immunological methods 204, 77-87 [online] http://www.ncbi.nlm.nih.gov/pubmed/9202712 (Accessed Nov. 24, 2012).
-   16. Liu, A. Y., Mack, P. W., Champion, C. I., and Robinson, R. R. (1987) Gene 54, 33-40 [online] http://www.ncbi.nlm.nih.gov/pubmed/3111940 (Accessed Nov. 24, 2012).
-   17. Schmitt, C. K., and Molineux, I. J. (1991) Journal of bacteriology 173, 1536-43 [online] http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=207293&tool=pmcentrez&rendertype=abstract (Accessed Nov. 10, 2012).
-   18. Markova, S. V, Golz, S., Frank, L. A., Kalthof, B., and Vysotski, E. S. (2004) The Journal of biological chemistry 279, 3212-7 [online] http://www.ncbi.nlm.nih.gov/pubmed/14583604 (Accessed Nov. 24, 2012).
-   19. Markova, S. V, Burakova, L. P., and Vysotski, E. S. (2012) Biochemical and biophysical research communications 417, 98-103 [online] http://www.ncbi.nlm.nih.gov/pubmed/22138240 (Accessed Jul. 20, 2012).
-   20. Shpigel, E., Goldlust, A., Efroni, G., Avraham, A., Eshel, A., Dekel, M., and Shoseyov, O. (1999) Biotechnology and bioengineering 65, 17-23 [online] http://www.ncbi.nlm.nih.gov/pubmed/10440667.
-   21. Cao, Y., Zhang, Q., Wang, C., Zhu, Y., and Bai, G. (2007) Journal of chromatography. A 1149, 228-35 [online] http://www.ncbi.nlm.nih.gov/pubmed/17391680 (Accessed Jul. 20, 2012).
-   22. Smith, K., Garman, L., Wrammert, J., Zheng, N., Capra, J. D., Ahmed, R., and Wilson, P. C. (2009)
-   23. Adekar, S. P., Takahashi, T., Jones, R. M., Al-Saleem, F. H., Ancharski, D. M., Root, M. J., Kapadnis, B. P., Simpson, L. L., and Dessain, S. K. (2008) PloS one 3, e3023 [online] http://dx.plos.org/10.1371/journal.pone.0003023 (Accessed Nov. 15, 2012).
-   24. Nowakowski, A., Wang, C., Powers, D. B., Amersdorfer, P., Smith, T. J., Montgomery, V. A., Sheridan, R., Blake, R., Smith, L. A., and Marks, J. D. (2002) Proceedings of the National Academy of Sciences of the United States of America 99, 11346-50 [online] http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=123259&tool=pmcentrez&rendertype=abstract (Accessed Nov. 25, 2012).
-   25. Hansen, A., Reiter, K., Dorner, T., and Pruss, A. (2005) Cell Tissue Bank 6, 299-308 [online] http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16308769.
-   26. Zdanovsky, A., Zdanovsky, D., and Zdanovskaia, M. (2012) Toxicon: official journal of the International Society on Toxinology 60, 1277-86 [online] http://www.ncbi.nlm.nih.gov/pubmed/22922018 (Accessed Nov. 4, 2012).
-   27. O'Callaghan, P. M., McLeod, J., Pybus, L. P., Lovelady, C. S., Wilkinson, S. J., Racher, A. J., Porter, A., and James, D. C. (2010) Biotechnology and bioengineering 106, 938-51 [online] http://www.ncbi.nlm.nih.gov/pubmed/20589672 (Accessed Nov. 26, 2012).
-   28. Florin, L., Pegel, A., Becker, E., Hausser, A., Olayioye, M. A., and Kaufmann, H. (2009) Journal of biotechnology 141, 84-90 [online] http://www.ncbi.nlm.nih.gov/pubmed/19428735 (Accessed Nov. 16, 2012).
-   29. Peng, R., Abellan, E., and Fussenegger, M. (2011) Biotechnol Bioeng 108, 611-620
-   30. Peng, R.-W., and Fussenegger, M. (2009) Biotechnology and bioengineering 102, 1170-81 [online] http://www.ncbi.nlm.nih.gov/pubmed/18989903 (Accessed Nov. 27, 2012).
-   31. Zhou, C., Jacobsen, F. W., Cai, L., Chen, Q., and Shen, W. D. mAbs 2, 508-18 [online] http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2958572&tool=pmcentrez&rendertype=abstract (Accessed Nov. 16, 2012).

APPENDIX 1 Nucleotide sequences of constructed plasmids pVLentry-Hyg10:1TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA TGACGTATGT TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA ACCGGGCGGA CCGACTGGCG GGTTGCTGGG GGCGGGTAAC TGCAGTTATT ACTGCATACA AGGGTATCAT TGCGGTTATC CCTGAAAGGT AACTGCAGTT 101TGGGTGGAGT ATTTACGGTA AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTACGCCCC CTATTGACGT CAATGACGGT AAATGGCCCG ACCCACCTCA TAAATGCCAT TTGACGGGTG AACCGTCATG TAGTTCACAT AGTATACGGT TCATGCGGGG GATAACTGCA GTTACTGCCA TTTACCGGGC 201CCTGGCATTA TGCCCAGTAC ATGACCTTAT GGGACTTTCC TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC GGTTTTGGCA GGACCGTAAT ACGGGTCATG TACTGGAATA CCCTGAAAGG ATGAACCGTC ATGTAGATGC ATAATCAGTA GCGATAATGG TACCACTACG CCAAAACCGT 301GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC TCCACCCCAT TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG CATGTAGTTA CCCGCACCTA TCGCCAAACT GAGTGCCCCT AAAGGTTCAG AGGTGGGGTA ACTGCAGTTA CCCTCAAACA AAACCGTGGT TTTAGTTGCC 401GACTTTCCAA AATGTCGTAA CAACTCCGCC CCATTGACGC AAATGGGCGG TAGGCGTGTA CGGTGGGAGG TCTATATAAG CAGAGCTGGT TTAGTGAACC CTGAAAGGTT TTACAGCATT GTTGAGGCGG GGTAACTGCG TTTACCCGCC ATCCGCACAT GCCACCCTCC AGATATATTC GTCTCGACCA AATCACTTGG                 Esp3I                 ~~~~~~~ 501GTCAGATCCG CTAGACGTCT CATTTAACTT TAAGAAGGAG ATATACATAT GGCTAGCATG ACTGGTGGAC AGCAAATGGG TACTAACCAA GGTAAAGGTG CAGTCTAGGC GATCTGCAGA GTAAATTGAA ATTCTTCCTC TATATGTATA CCGATCGTAC TGACCACCTG TCGTTTACCC ATGATTGGTT CCATTTCCAC 601TAGTTGCTGC TGGAGATAAA CTGGCGTTGT TCTTGAAGGT ATTTGGCGGT GAAGTCCTGA CTGCGTTCGC TCGTACCTCC GTGACCACTT CTCGCCACAT ATCAACGACG ACCTCTATTT GACCGCAACA AGAACTTCCA TAAACCGCCA CTTCAGGACT GACGCAAGCG AGCATGGAGG CACTGGTGAA GAGCGGTGTA 701GGTACGTTCC ATCTCCAGCG GTAAATCCGC TCAGTTCCCT GTTCTGGGTC GCACTCAGGC AGCGTATCTG GCTCCGGGCG AGAACCTCGA CGATAAACGT CCATGCAAGG TAGAGGTCGC CATTTAGGCG AGTCAAGGGA CAAGACCCAG CGTGAGTCCG TCGCATAGAC CGAGGCCCGC TCTTGGAGCT GCTATTTGCA 801AAGGACATCA AACACACCGA GAAGGTAATC ACCATTGACG GTCTCCTGAC GGCTGACGTT CTGATTTATG ATATTGAGGA 
CGCGATGAAC CACTACGACG TTCCTGTAGT TTGTGTGGCT CTTCCATTAG TGGTAACTGC CAGAGGACTG CCGACTGCAA GACTAAATAC TATAACTCCT GCGCTACTTG GTGATGCTGC 901TTCGCTCTGA GTATACCTCT CAGTTGGGTG AATCTCTGGC GATGGCTGCG GATGGTGCGG TTCTGGCTGA GATTGCCGGT CTGTGTAACG TGGAAAGCAA AAGCGAGACT CATATGGAGA GTCAACCCAC TTAGAGACCG CTACCGACGC CTACCACGCC AAGACCGACT CTAACGGCCA GACACATTGC ACCTTTCGTT 1001ATATAATGAG AACATCGAGG GCTTAGGTAC TGCTACCGTA ATTGAGACCA CTCAGAACAA GGCCGCACTT ACCGACCAAG TTGCGCTGGG TAAGGAGATT TATATTACTC TTGTAGCTCC CGAATCCATG ACGATGGCAT TAACTCTGGT GAGTCTTGTT CCGGCGTGAA TGGCTGGTTC AACGCGACCC ATTCCTCTAA 1101ATTGCGGCTC TGACTAAGGC TCGTGCGGCT CTGACCAAGA ACTATGTTCC GGCTGCTGAC CGTGTGTTCT ACTGTGACCC AGATAGCTAC TCTGCGATTC TAACGCCGAG ACTGATTCCG AGCACGCCGA GACTGGTTCT TGATACAAGG CCGACGACTG GCACACAAGA TGACACTGGG TCTATCGATG AGACGCTAAG 1201TGGCAGCACT GATGCCGAAC GCAGCAAACT ACGCTGCTCT GATTGACCCT GAGAAGGGTT CTATCCGCAA CGTTATGGGC TTTGAGGTTG TAGAAGTTCC ACCGTCGTGA CTACGGCTTG CGTCGTTTGA TGCGACGAGA CTAACTGGGA CTCTTCCCAA GATAGGCGTT GCAATACCCG AAACTCCAAC ATCTTCAAGG 1301GCACCTCACC GCTGGTGGTG CTGGTACCGC TCGTGAGGGC ACTACTGGTC AGAAGCACGT CTTCCCTGCC AATAAAGGTG AGGGTAATGT CAAGGTTGCTCGTGGAGTGG CGACCACCAC GACCATGGCG AGCACTCCCG TGATGACCAG TCTTCGTGCA GAAGGGACGG TTATTTCCAC TCCCATTACA GTTCCAACGA 1401AAGGACAACG TTATCGGCCT GTTCATGCAC CGCTCTGCGG TAGGTACTGT TAAGCTGCGT GACTTGGCTC TGGAGCGCGC TCGCCGTGCT AACTTCCAAG TTCCTGTTGC AATAGCCGGA CAAGTACGTG GCGAGACGCC ATCCATGACA ATTCGACGCA CTGAACCGAG ACCTCGCGCG AGCGGCACGA TTGAAGGTTC                                                                                                    Esp3I                                                                                                   ~~~~~~1501CGGACCAGAT TATCGCTAAG TACGCAATGG GCCACGGTGG TCTTCGCCCA GAAGCTGCAG GAGCTGTCGT ATTCCAGTCA GGTTAATTAC GAGACGCTCG GCCTGGTCTA ATAGCGATTC ATGCGTTACC CGGTGCCACC AGAAGCGGGT CTTCGACGTC CTCGACAGCA TAAGGTCAGT CCAATTAATG CTCTGCGAGC 1601AGCCGATCCG CATCAAAGCA TGCTGTTTTC TGTCTGTCCC TAACATGCCC TGTGATTATC CGCAAACAAC ACACCCAAGG 
GCAGAACTTT GTTACTTAAA TCGGCTAGGC GTAGTTTCGT ACGACAAAAG ACAGACAGGG ATTGTACGGG ACACTAATAG GCGTTTGTTG TGTGGGTTCC CGTCTTGAAA CAATGAATTT 1701CACCATCCTG TTTGCTTCTT TCCTCAGGAA CTGTGGCTGC ACCATCTGTC TTCATCTTCC CGCCATCTGA TGAGCAGTTG AAATCTGGAA CTGCCTCTGT GTGGTAGGAC AAACGAAGAA AGGAGTCCTT GACACCGACG TGGTAGACAG AAGTAGAAGG GCGGTAGACT ACTCGTCAAC TTTAGACCTT GACGGAGACA 1801TGTGTGCCTG CTGAATAACT TCTATCCCAG AGAGGCCAAA GTACAGTGGA AGGTGGATAA CGCCCTCCAA TCGGGTAACT CCCAGGAGAG TGTCACAGAG ACACACGGAC GACTTATTGA AGATAGGGTC TCTCCGGTTT CATGTCACCT TCCACCTATT GCGGGAGGTT AGCCCATTGA GGGTCCTCTC ACAGTGTCTC 1901CAGGACAGCA AGGACAGCAC CTACAGCCTC AGCAGCACCC TGACGCTGAG CAAAGCAGAC TACGAGAAAC ACAAAGTCTA CGCCTGCGAA GTCACCCATC GTCCTGTCGT TCCTGTCGTG GATGTCGGAG TCGTCGTGGG ACTGCGACTC GTTTCGTCTG ATGCTCTTTG TGTTTCAGAT GCGGACGCTT CAGTGGGTAG 2001AGGGCCTGAG CTCGCCCGTC ACAAAGAGCT TCAACAGGGG AGAGTGTTAG CGGCCAATTG GCGGCCGCAA TTTAATTCCG GTTATTTTCC ACCATATTGC TCCCGGACTC GAGCGGGCAG TGTTTCTCGA AGTTGTCCCC TCTCACAATC GCCGGTTAAC CGCCGGCGTT AAATTAAGGC CAATAAAAGG TGGTATAACG 2101CGTCTTTTGG CAATGTGAGG GCCCGGAAAC CTGGCCCTGT CTTCTTGACG AGCATTCCTA GGGGTCTTTC CCCTCTCGCC AAAGGAATGC AAGGTCTGTT GCAGAAAACC GTTACACTCC CGGGCCTTTG GACCGGGACA GAAGAACTGC TCGTAAGGAT CCCCAGAAAG GGGAGAGCGG TTTCCTTACG TTCCAGACAA 2201GAATGTCGTG AAGGAAGCAG TTCCTCTGGA AGCTTCTTGA AGACAAACAA CGTCTGTAGC GACCCTTTGC AGGCAGCGGA ACCCCCCACC TGGCGACAGG CTTACAGCAC TTCCTTCGTC AAGGAGACCT TCGAAGAACT TCTGTTTGTT GCAGACATCG CTGGGAAACG TCCGTCGCCT TGGGGGGTGG ACCGCTGTCC 2301TGCCTCTGCG GCCAAAAGCC ACGTGTATAA GATACACCTG CAAAGGCGGC ACAACCCCAG TGCCACGTTG TGAGTTGGAT AGTTGTGGAA AGAGTCAAAT ACGGAGACGC CGGTTTTCGG TGCACATATT CTATGTGGAC GTTTCCGCCG TGTTGGGGTC ACGGTGCAAC ACTCAACCTA TCAACACCTT TCTCAGTTTA 2401GGCTCACCTC AAGCGTATTC AACAAGGGGC TGAAGGATGC CCAGAAGGTA CCCCATTGTA TGGGATCTGA TCTGGGGCCT CGGTGCACAT GCTTTACATGCCGAGTGGAG TTCGCATAAG TTGTTCCCCG ACTTCCTACG GGTCTTCCAT GGGGTAACAT ACCCTAGACT AGACCCCGGA GCCACGTGTA CGAAATGTAC2501TGTTTAGTCG AGGTTAAAAA ACGTCTAGGC CCCCCGAACC ACGGGGACGT GGTTTTCCTT 
TGAAAAACAC GATGATAATA TGGCCACCAC CCATACCTAGACAAATCAGC TCCAATTTTT TGCAGATCCG GGGGGCTTGG TGCCCCTGCA CCAAAAGGAA ACTTTTTGTG CTACTATTAT ACCGGTGGTG GGTATGGATC2601GCTTTTGCAA AGATCGATCA GATCCCGGGG GGCAATGAGA TATGAAAAAG CCTGAACTCA CCGCGACGTC TGTCGAGAAG TTTCTGATCG AAAAGTTCGACGAAAACGTT TCTAGCTAGT CTAGGGCCCC CCGTTACTCT ATACTTTTTC GGACTTGAGT GGCGCTGCAG ACAGCTCTTC AAAGACTAGC TTTTCAAGCT2701CAGCGTATCC GACCTGATGC AGCTCTCGGA GGGCGAAGAA TCTCGTGCTT TCAGCTTCGA TGTAGGAGGG CGTGGATATG TCCTGCGGGT AAATAGCTGCGTCGCATAGG CTGGACTACG TCGAGAGCCT CCCGCTTCTT AGAGCACGAA AGTCGAAGCT ACATCCTCCC GCACCTATAC AGGACGCCCA TTTATCGACG2801GCCGATGGTT TCTACAAAGA TCGTTATGTT TATCGGCACT TTGCATCGGC CGCGCTCCCG ATTCCGGAAG TGCTTGACAT TGGGGAATTC AGCGAGAGCCCGGCTACCAA AGATGTTTCT AGCAATACAA ATAGCCGTGA AACGTAGCCG GCGCGAGGGC TAAGGCCTTC ACGAACTGTA ACCCCTTAAG TCGCTCTCGG2901TGACCTATTG CATCTCCCGC CGTGCACAGG GTGTCACGTT GCAAGACCTG CCTGAAACCG AACTGCCCGC TGTTCTGCAG CCGGTCGCGG AGGCCATGGAACTGGATAAC GTAGAGGGCG GCACGTGTCC CACAGTGCAA CGTTCTGGAC GGACTTTGGC TTGACGGGCG ACAAGACGTC GGCCAGCGCC TCCGGTACCT3001TGCGATCGCT GCGGCCGATC TTAGCCAGAC GAGCGGGTTC GGCCCATTCG GACCGCAAGG AATCGGTCAA TACACTACAT GGCGTGATTT CATATGCGCGACGCTAGCGA CGCCGGCTAG AATCGGTCTG CTCGCCCAAG CCGGGTAAGC CTGGCGTTCC TTAGCCAGTT ATGTGATGTA CCGCACTAAA GTATACGCGC3101ATTGCTGATC CCCATGTGTA TCACTGGCAA ACTGTGATGG ACGACACCGT CAGTGCGTCC GTCGCGCAGG CTCTCGATGA GCTGATGCTT TGGGCCGAGGTAACGACTAG GGGTACACAT AGTGACCGTT TGACACTACC TGCTGTGGCA GTCACGCAGG CAGCGCGTCC GAGAGCTACT CGACTACGAA ACCCGGCTCC3201ACTGCCCCGA AGTCCGGCAC CTCGTGCACG CGGATTTCGG CTCCAACAAT GTCCTGACGG ACAATGGCCG CATAACAGCG GTCATTGACT GGAGCGAGGCTGACGGGGCT TCAGGCCGTG GAGCACGTGC GCCTAAAGCC GAGGTTGTTA CAGGACTGCC TGTTACCGGC GTATTGTCGC CAGTAACTGA CCTCGCTCCG3301GATGTTCGGG GATTCCCAAT ACGAGGTCGC CAACATCTTC TTCTGGAGGC CGTGGTTGGC TTGTATGGAG CAGCAGACGC GCTACTTCGA GCGGAGGCATCTACAAGCCC CTAAGGGTTA TGCTCCAGCG GTTGTAGAAG AAGACCTCCG GCACCAACCG AACATACCTC GTCGTCTGCG CGATGAAGCT CGCCTCCGTA3401CCGGAGCTTG CAGGATCGCC GCGGCTCCGG GCGTATATGC TCCGCATTGG TCTTGACCAA 
CTCTATCAGA GCTTGGTTGA CGGCAATTTC GATGATGCAGGGCCTCGAAC GTCCTAGCGG CGCCGAGGCC CGCATATACG AGGCGTAACC AGAACTGGTT GAGATAGTCT CGAACCAACT GCCGTTAAAG CTACTACGTC3501CTTGGGCGCA GGGTCGATGC GACGCAATCG TCCGATCCGG AGCCGGGACT GTCGGGCGTA CACAAATCGC CCGCAGAAGC GCGGCCGTCT GGACCGATGGGAACCCGCGT CCCAGCTACG CTGCGTTAGC AGGCTAGGCC TCGGCCCTGA CAGCCCGCAT GTGTTTAGCG GGCGTCTTCG CGCCGGCAGA CCTGGCTACC3601CTGTGTAGAA GTACTCGCCG ATAGTGGAAA CCGACGCCCC AGCACTCGTC CGGATCGGGA GATGGGGGAG GCTAACTGAA ACACGGAAGG AGACAATACCGACACATCTT CATGAGCGGC TATCACCTTT GGCTGCGGGG TCGTGAGCAG GCCTAGCCCT CTACCCCCTC CGATTGACTT TGTGCCTTCC TCTGTTATGG                                                                                                      I-SceI                                                                                                   ~~~~~~~~~~3701GGAAGGAACC TCGACGTTAA CTTGTTTATT GCAGCTTATA ATGGTTACAA ATAAAGCAAT AGCATCACAA ATTTCACAAA TAAAGCATTT ATTACCCTGT CCTTCCTTGG AGCTGCAATT GAACAAATAA CGTCGAATAT TACCAATGTT TATTTCGTTA TCGTAGTGTT TAAAGTGTTT ATTTCGTAAA TAATGGGACA  I-SceI ~~~~~~~~ 3801TATCCCTAGA ATTCACTGGC CGTCGTTTTA CAACGTCGTG ACTGGGAAAA CCCTGGCGTT ACCCAACTTA ATCGCCTTGC AGCACATCCC CCTTTCGCCA ATAGGGATCT TAAGTGACCG GCAGCAAAAT GTTGCAGCAC TGACCCTTTT GGGACCGCAA TGGGTTGAAT TAGCGGAACG TCGTGTAGGG GGAAAGCGGT 3901GCTGGCGTAA TAGCGAAGAG GCCCGCACCG ATCGCCCTTC CCAACAGTTG CGCAGCCTGA ATGGCGAATG GCGCCTGATG CGGTATTTTC TCCTTACGCA CGACCGCATT ATCGCTTCTC CGGGCGTGGC TAGCGGGAAG GGTTGTCAAC GCGTCGGACT TACCGCTTAC CGCGGACTAC GCCATAAAAG AGGAATGCGT 4001TCTGTGCGGT ATTTCACACC GCATACGTCA AAGCAACCAT AGTACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG CGCAGCGTGA AGACACGCCA TAAAGTGTGG CGTATGCAGT TTCGTTGGTA TCATGCGCGG GACATCGCCG CGTAATTCGC GCCGCCCACA CCACCAATGC GCGTCGCACT 4101CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCTTTCCC CGTCAAGCTC TAAATCGGGG GGCGATGTGA ACGGTCGCGG GATCGCGGGC GAGGAAAGCG AAAGAAGGGA AGGAAAGAGC GGTGCAAGCG GCCGAAAGGG GCAGTTCGAG ATTTAGCCCC 4201GCTCCCTTTA GGGTTCCGAT TTAGTGCTTT ACGGCACCTC 
GACCCCAAAA AACTTGATTT GGGTGATGGT TCACGTAGTG GGCCATCGCC CTGATAGACG CGAGGGAAAT CCCAAGGCTA AATCACGAAA TGCCGTGGAG CTGGGGTTTT TTGAACTAAA CCCACTACCA AGTGCATCAC CCGGTAGCGG GACTATCTGC 4301GTTTTTCGCC CTTTGACGTT GGAGTCCACG TTCTTTAATA GTGGACTCTT GTTCCAAACT GGAACAACAC TCAACCCTAT CTCGGGCTAT TCTTTTGATT CAAAAAGCGG GAAACTGCAA CCTCAGGTGC AAGAAATTAT CACCTGAGAA CAAGGTTTGA CCTTGTTGTG AGTTGGGATA GAGCCCGATA AGAAAACTAA 4401TATAAGGGAT TTTGCCGATT TCGGCCTATT GGTTAAAAAA TGAGCTGATT TAACAAAAAT TTAACGCGAA TTTTAACAAA ATATTAACGT TTACAATTTT ATATTCCCTA AAACGGCTAA AGCCGGATAA CCAATTTTTT ACTCGACTAA ATTGTTTTTA AATTGCGCTT AAAATTGTTT TATAATTGCA AATGTTAAAA 4501ATGGTGCACT CTCAGTACAA TCTGCTCTGA TGCCGCATAG TTAAGCCAGC CCCGACACCC GCCAACACCC GCTGACGCGC CCTGACGGGC TTGTCTGCTC TACCACGTGA GAGTCATGTT AGACGAGACT ACGGCGTATC AATTCGGTCG GGGCTGTGGG CGGTTGTGGG CGACTGCGCG GGACTGCCCG AACAGACGAG 4601CCGGCATCCG CTTACAGACA AGCTGTGACC GTCTAGACGA AAGGGCCTCG TGATACGCCT ATTTTTATAG GTTAATGTCA TGATAATAAT GGTTTCTTAG GGCCGTAGGC GAATGTCTGT TCGACACTGG CAGATCTGCT TTCCCGGAGC ACTATGCGGA TAAAAATATC CAATTACAGT ACTATTATTA CCAAAGAATC 4701ACGTCAGGTG GCACTTTTCG GGGAAATGTG CGCGGAACCC CTATTTGTTT ATTTTTCTAA ATACATTCAA ATATGTATCC GCTCATGAGA CAATAACCCT TGCAGTCCAC CGTGAAAAGC CCCTTTACAC GCGCCTTGGG GATAAACAAA TAAAAAGATT TATGTAAGTT TATACATAGG CGAGTACTCT GTTATTGGGA 4801GATAAATGCT TCAATAATAT TGAAAAAGGA AGAGTATGAG TATTCAACAT TTCCGTGTCG CCCTTATTCC CTTTTTTGCG GCATTTTGCC TTCCTGTTTT CTATTTACGA AGTTATTATA ACTTTTTCCT TCTCATACTC ATAAGTTGTA AAGGCACAGC GGGAATAAGG GAAAAAACGC CGTAAAACGG AAGGACAAAA 4901TGCTCACCCA GAAACGCTGG TGAAAGTAAA AGATGCTGAA GATCAGTTGG GTGCACGAGT GGGTTACATC GAACTGGATC TCAACAGCGG TAAGATCCTT ACGAGTGGGT CTTTGCGACC ACTTTCATTT TCTACGACTT CTAGTCAACC CACGTGCTCA CCCAATGTAG CTTGACCTAG AGTTGTCGCC ATTCTAGGAA 5001GAGAGTTTTC GCCCCGAAGA ACGTTTTCCA ATGATGAGCA CTTTTAAAGT TCTGCTATGT GGCGCGGTAT TATCCCGTAT TGACGCCGGG CAAGAGCAAC CTCTCAAAAG CGGGGCTTCT TGCAAAAGGT TACTACTCGT GAAAATTTCA AGACGATACA CCGCGCCATA ATAGGGCATA ACTGCGGCCC GTTCTCGTTG 5101TCGGTCGCCG CATACACTAT 
TCTCAGAATG ACTTGGTTGA GTACTCACCA GTCACAGAAA AGCATCTTAC GGATGGCATG ACAGTAAGAG AATTATGCAG AGCCAGCGGC GTATGTGATA AGAGTCTTAC TGAACCAACT CATGAGTGGT CAGTGTCTTT TCGTAGAATG CCTACCGTAC TGTCATTCTC TTAATACGTC 5201TGCTGCCATA ACCATGAGTG ATAACACTGC GGCCAACTTA CTTCTGACAA CGATCGGAGG ACCGAAGGAG CTAACCGCTT TTTTGCACAA CATGGGGGAT ACGACGGTAT TGGTACTCAC TATTGTGACG CCGGTTGAAT GAAGACTGTT GCTAGCCTCC TGGCTTCCTC GATTGGCGAA AAAACGTGTT GTACCCCCTA 5301CATGTAACTC GCCTTGATCG TTGGGAACCG GAGCTGAATG AAGCCATACC AAACGACGAG CGTGACACCA CGATGCCTGT AGCAATGGCA ACAACGTTGC GTACATTGAG CGGAACTAGC AACCCTTGGC CTCGACTTAC TTCGGTATGG TTTGCTGCTC GCACTGTGGT GCTACGGACA TCGTTACCGT TGTTGCAACG 5401GCAAACTATT AACTGGCGAA CTACTTACTC TAGCTTCCCG GCAACAATTA ATAGACTGGA TGGAGGCGGA TAAAGTTGCA GGACCACTTC TGCGCTCGGCCGTTTGATAA TTGACCGCTT GATGAATGAG ATCGAAGGGC CGTTGTTAAT TATCTGACCT ACCTCCGCCT ATTTCAACGT CCTGGTGAAG ACGCGAGCCG 5501CCTTCCGGCT GGCTGGTTTA TTGCTGATAA ATCTGGAGCC GGTGAGCGTG GGTCTCGCGG TATCATTGCA GCACTGGGGC CAGATGGTAA GCCCTCCCGT GGAAGGCCGA CCGACCAAAT AACGACTATT TAGACCTCGG CCACTCGCAC CCAGAGCGCC ATAGTAACGT CGTGACCCCG GTCTACCATT CGGGAGGGCA 5601ATCGTAGTTA TCTACACGAC GGGGAGTCAG GCAACTATGG ATGAACGAAA TAGACAGATC GCTGAGATAG GTGCCTCACT GATTAAGCAT TGGTAACTGT TAGCATCAAT AGATGTGCTG CCCCTCAGTC CGTTGATACC TACTTGCTTT ATCTGTCTAG CGACTCTATC CACGGAGTGA CTAATTCGTA ACCATTGACA 5701CAGACCAAGT TTACTCATAT ATACTTTAGA TTGATTTAAA ACTTCATTTT TAATTTAAAA GGATCTAGGT GAAGATCCTT TTTGATAATC TCATGACCAA GTCTGGTTCA AATGAGTATA TATGAAATCT AACTAAATTT TGAAGTAAAA ATTAAATTTT CCTAGATCCA CTTCTAGGAA AAACTATTAG AGTACTGGTT 5801AATCCCTTAA CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTAGGGAATT GCACTCAAAA GCAAGGTGAC TCGCAGTCTG GGGCATCTTT TCTAGTTTCC TAGAAGAACT CTAGGAAAAA AAGACGCGCA TTAGACGACG 5901TTGCAAACAA AAAAACCACC GCTACCAGCG GTGGTTTGTT TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC AGAGCGCAGA AACGTTTGTT TTTTTGGTGG CGATGGTCGC CACCAAACAA ACGGCCTAGT TCTCGATGGT TGAGAAAAAG GCTTCCATTG ACCGAAGTCG TCTCGCGTCT 
6001TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT ATGGTTTATG ACAGGAAGAT CACATCGGCA TCAATCCGGT GGTGAAGTTC TTGAGACATC GTGGCGGATG TATGGAGCGA GACGATTAGG ACAATGGTCA 6101GGCTGCTGCC AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC GGATAAGGCG CAGCGGTCGG GCTGAACGGG GGGTTCGTGCCCGACGACGG TCACCGCTAT TCAGCACAGA ATGGCCCAAC CTGAGTTCTG CTATCAATGG CCTATTCCGC GTCGCCAGCC CGACTTGCCC CCCAAGCACG6201ACACAGCCCA GCTTGGAGCG AACGACCTAC ACCGAACTGA GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC CGAAGGGAGA AAGGCGGACATGTGTCGGGT CGAACCTCGC TTGCTGGATG TGGCTTGACT CTATGGATGT CGCACTCGAT ACTCTTTCGC GGTGCGAAGG GCTTCCCTCT TTCCGCCTGT6301GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCTCCATAGGCCA TTCGCCGTCC CAGCCTTGTC CTCTCGCGTG CTCCCTCGAA GGTCCCCCTT TGCGGACCAT AGAAATATCA GGACAGCCCA AAGCGGTGGA6401CTGACTTGAG CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC CAGCAACGCG GCCTTTTTAC GGTTCCTGGC CTTTTGCTGGGACTGAACTC GCAGCTAAAA ACACTACGAG CAGTCCCCCC GCCTCGGATA CCTTTTTGCG GTCGTTGCGC CGGAAAAATG CCAAGGACCG GAAAACGACC6501CCTTTTGCTC ACATGTTCTT TCCTGCGTTA TCCCCTGATT CTGTGGATAA CCGTATTACC GCCTTTGAGT GAGCTGATAC CGCTCGCCGC AGCCGAACGAGGAAAACGAG TGTACAAGAA AGGACGCAAT AGGGGACTAA GACACCTATT GGCATAATGG CGGAAACTCA CTCGACTATG GCGAGCGGCG TCGGCTTGCT6601CCGAGCGCAG CGAGTCAGTG AGCGAGGAAG CGGAAGAGCG CCCAATACGC AAACCGCCTC TCCCCGCGCG TTGGCCGATT CATTAATGCA GCTGGCACGAGGCTCGCGTC GCTCAGTCAC TCGCTCCTTC GCCTTCTCGC GGGTTATGCG TTTGGCGGAG AGGGGCGCGC AACCGGCTAA GTAATTACGT CGACCGTGCT6701CAGGTTTCCC GACTGGAAAG CGGGCAGTGA GCGCAACGCA ATTAATGTGA GTTAGCTCAC TCATTAGGCA CCCCAGGCTT TACACTTTAT GCTTCCGGCTGTCCAAAGGG CTGACCTTTC GCCCGTCACT CGCGTTGCGT TAATTACACT CAATCGAGTG AGTAATCCGT GGGGTCCGAA ATGTGAAATA CGAAGGCCGA                                                                                        I-SceI                                                                                  ~~~~~~~~~~~~~~~~~~~~6801CGTATGTTGT GTGGAATTGT 
GAGCGGATAA CAATTTCACA CAGGAAACAG CTATGACCAT GATTACGCCA AGCTTTAGGG ATAACAGGGT AATCGCCATGGCATACAACA CACCTTAACA CTCGCCTATT GTTAAAGTGT GTCCTTTGTC GATACTGGTA CTAATGCGGT TCGAAATCCC TATTGTCCCA TTAGCGGTAC6901CATTAGTTAT TAATAGTAAT CAATTACGGG GTCATTAGTT CATAGCCCAT ATATGGAGTT CCGCGTTACA TAACTTACGG TAAAGTAATCAATA ATTATCATTA GTTAATGCCC CAGTAATCAA GTATCGGGTA TATACCTCAA GGCGCAATGT ATTGAATGCC ATTTpVHentry-Cm5:                               Esp3I                              ~~~~~~~ 1GGTTTAGTGA ACCGTCAGAT CCGCTAGACG TCTCATATAC CTGACTGGAA TACGACAGCT CCTGCAGCTT CTGGGCGAAG ACCACCGTGG CCCATTGCGTCCAAATCACT TGGCAGTCTA GGCGATCTGC AGAGTATATG GACTGACCTT ATGCTGTCGA GGACGTCGAA GACCCGCTTC TGGTGGCACC GGGTAACGCA101ACTTAGCGAT AATCTGGTCC GCTTGGAAGT TAGCACGGCG AGCGCGCTCC AGAGCCAAGT CACGCAGCTT AACAGTACCT ACCGCAGAGC GGTGCATGAATGAATCGCTA TTAGACCAGG CGAACCTTCA ATCGTGCCGC TCGCGCGAGG TCTCGGTTCA GTGCGTCGAA TTGTCATGGA TGGCGTCTCG CCACGTACTT 201CAGGCCGATA ACGTTGTCCT TAGCAACCTT GACATTACCC TCACCTTTAT TGGCAGGGAA GACGTGCTTC TGACCAGTAG TGCCCTCACG AGCGGTACCA GTCCGGCTAT TGCAACAGGA ATCGTTGGAA CTGTAATGGG AGTGGAAATA ACCGTCCCTT CTGCACGAAG ACTGGTCATC ACGGGAGTGC TCGCCATGGT 301GCACCACCAG CGGTGAGGTG CGGAACTTCT ACAACCTCAA AGCCCATAAC GTTGCGGATA GAACCCTTCT CAGGGTCAAT CAGAGCAGCG TAGTTTGCTG CGTGGTGGTC GCCACTCCAC GCCTTGAAGA TGTTGGAGTT TCGGGTATTG CAACGCCTAT CTTGGGAAGA GTCCCAGTTA GTCTCGTCGC ATCAAACGAC 401CGTTCGGCAT CAGTGCTGCC AGAATCGCAG AGTAGCTATC TGGGTCACAG TAGAACACAC GGTCAGCAGC CGGAACATAG TTCTTGGTCA GAGCCGCACG GCAAGCCGTA GTCACGACGG TCTTAGCGTC TCATCGATAG ACCCAGTGTC ATCTTGTGTG CCAGTCGTCG GCCTTGTATC AAGAACCAGT CTCGGCGTGC 501AGCCTTAGTC AGAGCCGCAA TAATCTCCTT ACCCAGCGCA ACTTGGTCGG TAAGTGCGGC CTTGTTCTGA GTGGTCTCAA TTACGGTAGC AGTACCTAAG TCGGAATCAG TCTCGGCGTT ATTAGAGGAA TGGGTCGCGT TGAACCAGCC ATTCACGCCG GAACAAGACT CACCAGAGTT AATGCCATCG TCATGGATTC 601CCCTCGATGT TCTCATTATA TTTGCTTTCC ACGTTACACA GACCGGCAAT CTCAGCCAGA ACCGCACCAT CCGCAGCCAT CGCCAGAGAT TCACCCAACT GGGAGCTACA AGAGTAATAT AAACGAAAGG TGCAATGTGT CTGGCCGTTA GAGTCGGTCT TGGCGTGGTA 
GGCGTCGGTA GCGGTCTCTA AGTGGGTTGA 701GAGAGGTATA CTCAGAGCGA ACGTCGTAGT GGTTCATCGC GTCCTCAATA TCATAAATCA GAACGTCAGC CGTCAGGAGA CCGTCAATGG TGATTACCTT CTCTCCATAT GAGTCTCGCT TGCAGCATCA CCAAGTAGCG CAGGAGTTAT AGTATTTAGT CTTGCAGTCG GCAGTCCTCT GGCAGTTACC ACTAATGGAA 801CTCGGTGTGT TTGATGTCCT TACGTTTATC GTCGAGGTTC TCGCCCGGAG CCAGATACGC TGCCTGAGTG CGACCCAGAA CAGGGAACTG AGCGGATTTA GAGCCACACA AACTACAGGA ATGCAAATAG CAGCTCCAAG AGCGGGCCTC GGTCTATGCG ACGGACTCAC GCTGGGTCTT GTCCCTTGAC TCGCCTAAAT 901CCGCTGGAGA TGGAACGTAC CATGTGGCGA GAAGTGGTCA CGGAGGTACG AGCGAACGCA GTCAGGACTT CACCGCCAAA TACCTTCAAG AACAACGCCA GGCGACCTCT ACCTTGCATG GTACACCGCT CTTCACCAGT GCCTCCATGC TCGCTTGCGT CAGTCCTGAA GTGGCGGTTT ATGGAAGTTC TTGTTGCGGT                                                                                                         Esp3I                                                                                                        ~~~~~1001GTTTATCTCC AGCAGCAACT ACACCTTTAC CTTGGTTAGT ACCCATTTGC TGTCCACCAG TCATGCTAGC CATATGTATA TCTCCTTCTT AAAGTCGTCT CAAATAGAGG TCGTCGTTGA TGTGGAAATG GAACCAATCA TGGGTAAACG ACAGGTGGTC AGTACGATCG GTATACATAT AGAGGAAGAA TTTCAGCAGA Esp3I ~ 1101CCAGTGCCTC CACCAAGGGC CCATCGGTCT TCCCCCTGGC GCCCTGCTCC AGGAGCACCT CCGAGAGCAC AGCGGCCCTG GGCTGCCTGG TCAAGGACTA GGTCACGGAG GTGGTTCCCG GGTAGCCAGA AGGGGGACCG CGGGACGAGG TCCTCGTGGA GGCTCTCGTG TCGCCGGGAC CCGACGGACC AGTTCCTGAT 1201CTTCCCCGAA CCGGTGACGG TGTCGTGGAA CTCAGGCGCT CTGACCAGCG GCGTGCACAC CTTCCCAGCT GTCCTACAGT CCTCAGGACT CTACTCCCTC GAAGGGGCTT GGCCACTGCC ACAGCACCTT GAGTCCGCGA GACTGGTCGC CGCACGTGTG GAAGGGTCGA CAGGATGTCA GGAGTCCTGA GATGAGGGAG 1301AGCAGCGTGG TGACCGTGCC CTCCAGCAGC TTGGGCACCC AGACCTACAT CTGCAACGTG AATCACAAGC CCAGCAACAC CAAGGTGGAC AAGAAAGTTG TCGTCGCACC ACTGGCACGG GAGGTCGTCG AACCCGTGGG TCTGGATGTA GACGTTGCAC TTAGTGTTCG GGTCGTTGTG GTTCCACCTG TTCTTTCAAC 1401AGCCCAAATC TTGTGACAAA ACTCACACAT GCCCACCGTG CCCAGCACCT GAACTCCTGG GGGGACCGTC AGTCTTCCTC TTCCCCCCMA AACCCAAGGA TCGGGTTTAG AACACTGTTT TGAGTGTGTA CGGGTGGCAC GGGTCGTGGA 
CTTGAGGACC CCCCTGGCAG TCAGAAGGAG AAGGGGGGKT TTGGGTTCCT 1501CACCCTCATG ATCTCCCGGA CCCCTGAGGT CACATGCGTG GTGGTGGACG TGAGCCACGA AGACCCTGAG GTCAAGTTCA ACTGGTACGT GGACGGCGTG GTGGGAGTAC TAGAGGGCCT GGGGACTCCA GTGTACGCAC CACCACCTGC ACTCGGTGCT TCTGGGACTC CAGTTCAAGT TGACCATGCA CCTGCCGCAC 1601GAGGTGCATA ATGCCAAGAC AAAGCCGCGG GAGGAGCAGT ACAACAGCAC GTACCGTGTG GTCAGCGTCC TCACCGTCCT GCACCAGGAC TGGCTGAATG CTCCACGTAT TACGGTTCTG TTTCGGCGCC CTCCTCGTCA TGTTGTCGTG CATGGCACAC CAGTCGCAGG AGTGGCAGGA CGTGGTCCTG ACCGACTTAC 1701GCAAGGAGTA CAAGTGCAAG GTCTCCAACA AAGCCCTCCC AGCCCCCATC GAGAAAACCA TCTCCAAAGC CAAAGGGCAG CCCCGAGAAC CACAGGTGTA CGTTCCTCAT GTTCACGTTC CAGAGGTTGT TTCGGGAGGG TCGGGGGTAG CTCTTTTGGT AGAGGTTTCG GTTTCCCGTC GGGGCTCTTG GTGTCCACAT 1801CACCCTGCCC CCATCCCGGG ATGAGCTGAC CAAGAACCAG GTCAGCCTGA CCTGCCTGGT CAAAGGCTTC TATCCCAGCG ACATCGCCGT GGAGTGGGAG GTGGGACGGG GGTAGGGCCC TACTCGACTG GTTCTTGGTC CAGTCGGACT GGACGGACCA GTTTCCGAAG ATAGGGTCGC TGTAGCGGCA CCTCACCCTC 1901AGCAATGGGC AGCCGGAGAA CAACTACAAG ACCACGCCTC CCGTGCTGGA CTCCGACGGC TCCTTCTTCC TCTACAGCAA GCTCACCGTG GACAAGAGCA TCGTTACCCG TCGGCCTCTT GTTGATGTTC TGGTGCGGAG GGCACGACCT GAGGCTGCCG AGGAAGAAGG AGATGTCGTT CGAGTGGCAC CTGTTCTCGT 2001GGTGGCAGCA GGGGAACGTC TTCTCATGCT CCGTGATGCA TGAGGCTCTG CACAACCACT ACACGCAGAA GAGCCTCTCC CTGTCTCCGG GTAAATGAGC CCACCGTCGT CCCCTTGCAG AAGAGTACGA GGCACTACGT ACTCCGAGAC GTGTTGGTGA TGTGCGTCTT CTCGGAGAGG GACAGAGGCC CATTTACTCG 2101GGCCGCAATT TAATTCCGGT TATTTTCCAC CATATTGCCG TCTTTTGGCA ATGTGAGGGC CCGGAAACCT GGCCCTGTCT TCTTGACGAG CATTCCTAGG CCGGCGTTAA ATTAAGGCCA ATAAAAGGTG GTATAACGGC AGAAAACCGT TACACTCCCG GGCCTTTGGA CCGGGACAGA AGAACTGCTC GTAAGGATCC 2201GGTCTTTCCC CTCTCGCCAA AGGAATGCAA GGTCTGTTGA ATGTCGTGAA GGAAGCAGTT CCTCTGGAAG CTTCTTGAAG ACAAACAACG TCTGTAGCGA CCAGAAAGGG GAGAGCGGTT TCCTTACGTT CCAGACAACT TACAGCACTT CCTTCGTCAA GGAGACCTTC GAAGAACTTC TGTTTGTTGC AGACATCGCT 2301CCCTTTGCAG GCAGCGGAAC CCCCCACCTG GCGACAGGTG CCTCTGCGGC CAAAAGCCAC GTGTATAAGA TACACCTGCA AAGGCGGCAC AACCCCAGTG GGGAAACGTC CGTCGCCTTG GGGGGTGGAC 
CGCTGTCCAC GGAGACGCCG GTTTTCGGTG CACATATTCT ATGTGGACGT TTCCGCCGTG TTGGGGTCAC 2401CCACGTTGTG AGTTGGATAG TTGTGGAAAG AGTCAAATGG CTCACCTCAA GCGTATTCAA CAAGGGGCTG AAGGATGCCC AGAAGGTACC CCATTGTATG GGTGCAACAC TCAACCTATC AACACCTTTC TCAGTTTACC GAGTGGAGTT CGCATAAGTT GTTCCCCGAC TTCCTACGGG TCTTCCATGG GGTAACATAC 2501GGATCTGATC TGGGGCCTCG GTGCACATGC TTTACATGTG TTTAGTCGAG GTTAAAAAAC GTCTAGGCCC CCCGAACCAC GGGGACGTGG TTTTCCTTTG CCTAGACTAG ACCCCGGAGC CACGTGTACG AAATGTACAC AAATCAGCTC CAATTTTTTG CAGATCCGGG GGGCTTGGTG CCCCTGCACC AAAAGGAAAC 2601AAAAACACGA TGATAATATG GCCACCACCC ATACCTAGGC TTTTGCAAAG ATCGATCAAG AGACAGGATG AGGATCGTTT CGCATGATTG AACAAGATGGTTTTTGTGCT ACTATTATAC CGGTGGTGGG TATGGATCCG AAAACGTTTC TAGCTAGTTC TCTGTCCTAC TCCTAGCAAA GCGTACTAAC TTGTTCTACC2701ATTGCACGCA GGTTCTCCGG CCGCTTGGGT GGAGAGGCTA TTCGGCTATG ACTGGGCACA ACAGACAATC GGCTGCTCTG ATGCCGCCGT GTTCCGGCTGTAACGTGCGT CCAAGAGGCC GGCGAACCCA CCTCTCCGAT AAGCCGATAC TGACCCGTGT TGTCTGTTAG CCGACGAGAC TACGGCGGCA CAAGGCCGAC2801TCAGCGCAGG GGCGCCCGGT TCTTTTTGTC AAGACCGACC TGTCCGGTGC CCTGAATGAA CTGCAAGACG AGGCAGCGCG GCTATCGTGG CTGGCCACGAAGTCGCGTCC CCGCGGGCCA AGAAAAACAG TTCTGGCTGG ACAGGCCACG GGACTTACTT GACGTTCTGC TCCGTCGCGC CGATAGCACC GACCGGTGCT2901CGGGCGTTCC TTGCGCAGCT GTGCTCGACG TTGTCACTGA AGCGGGAAGG GACTGGCTGC TATTGGGCGA AGTGCCGGGG CAGGATCTCC TGTCATCTCAGCCCGCAAGG AACGCGTCGA CACGAGCTGC AACAGTGACT TCGCCCTTCC CTGACCGACG ATAACCCGCT TCACGGCCCC GTCCTAGAGG ACAGTAGAGT3001CCTTGCTCCT GCCGAGAAAG TATCCATCAT GGCTGATGCA ATGCGGCGGC TGCATACGCT TGATCCGGCT ACCTGCCCAT TCGACCACCA AGCGAAACATGGAACGAGGA CGGCTCTTTC ATAGGTAGTA CCGACTACGT TACGCCGCCG ACGTATGCGA ACTAGGCCGA TGGACGGGTA AGCTGGTGGT TCGCTTTGTA3101CGCATCGAGC GAGCACGTAC TCGGATGGAA GCCGGTCTTG TCGATCAGGA TGATCTGGAC GAAGAGCATC AGGGGCTCGC GCCAGCCGAA CTGTTCGCCAGCGTAGCTCG CTCGTGCATG AGCCTACCTT CGGCCAGAAC AGCTAGTCCT ACTAGACCTG CTTCTCGTAG TCCCCGAGCG CGGTCGGCTT GACAAGCGGT3201GGCTCAAGGC GAGCATGCCC GACGGCGAGG ATCTCGTCGT GACCCATGGC GATGCCTGCT TGCCGAATAT CATGGTGGAA AATGGCCGCT TTTCTGGATTCCGAGTTCCG CTCGTACGGG 
CTGCCGCTCC TAGAGCAGCA CTGGGTACCG CTACGGACGA ACGGCTTATA GTACCACCTT TTACCGGCGA AAAGACCTAA3301CATCGACTGT GGCCGGCTGG GTGTGGCGGA CCGCTATCAG GACATAGCGT TGGCTACCCG TGATATTGCT GAAGAGCTTG GCGGCGAATG GGCTGACCGCGTAGCTGACA CCGGCCGACC CACACCGCCT GGCGATAGTC CTGTATCGCA ACCGATGGGC ACTATAACGA CTTCTCGAAC CGCCGCTTAC CCGACTGGCG3401TTCCTCGTGC TTTACGGTAT CGCCGCTCCC GATTCGCAGC GCATCGCCTT CTATCGCCTT CTTGACGAGT TCTTCTGAGC GGGACTCTGG GGTTCGGGCCAAGGAGCACG AAATGCCATA GCGGCGAGGG CTAAGCGTCG CGTAGCGGAA GATAGCGGAA GAACTGCTCA AGAAGACTCG CCCTGAGACC CCAAGCCCGG3501GCACTCGAGC ATAAACTTGT TTATTGCAGC TTATAATGGT TACAAATAAA GCAATAGCAT CACAAATTTC ACAAATAAAG CATTTTTTTC ACTGCATTCTCGTGAGCTCG TATTTGAACA AATAACGTCG AATATTACCA ATGTTTATTT CGTTATCGTA GTGTTTAAAG TGTTTATTTC GTAAAAAAAG TGACGTAAGAI-SceI ~~~~~~~~~~~~~~~~~~~~ 3601AGTTGTGGTT TGTCCAAACT CATCAATGTA TCTTAAGTAG GGATAACAGG GTAATTTTGT TAAATCAGCT CATTTTTTAA CCAATAGGAA CGCCATCAAATCAACACCAA ACAGGTTTGA GTAGTTACAT AGAATTCATC CCTATTGTCC CATTAAAACA ATTTAGTCGA GTAAAAAATT GGTTATCCTT GCGGTAGTTT3701AATAATTCGC GTCTGGCCTT CCTGTAGCCA GCTTTCATCA ACATTAAATG TGAGCGAGTA ACAACCCGTC GGATTCTCCG TGGGAACAAA CGGCGGATTGTTATTAAGCG CAGACCGGAA GGACATCGGT CGAAAGTAGT TGTAATTTAC ACTCGCTCAT TGTTGGGCAG CCTAAGAGGC ACCCTTGTTT GCCGCCTAAC3801ACCGTAATGG GATAGGTTAC GTTGGTGTAG ATGGGCGCAT CGTAACCGTG CATCTGCCAG TTTGAGGGGA CGACGACCGT ATCGGCCTCA GGAAGATCGCTGGCATTACC CTATCCAATG CAACCACATC TACCCGCGTA GCATTGGCAC GTAGACGGTC AAACTCCCCT GCTGCTGGCA TAGCCGGAGT CCTTCTAGCG3901ACTCCAGCCA GCTTTCCGGC ACCGCTTCTG GTGCCGGAAA CCAGGCAAAG CGCCATTCGC CATTCAGGCT GCGCAACTGT TGGGAAGGGC GATCGGTGCG TGAGGTCGGT CGAAAGGCCG TGGCGAAGAC CACGGCCTTT GGTCCGTTTC GCGGTAAGCG GTAAGTCCGA CGCGTTGACA ACCCTTCCCG CTAGCCACGC 4001GGCCTCTTCG CTATTACGCC AGCTGGCGAA AGGGGGATGT GCTGCAAGGC GATTAAGTTG GGTAACGCCA GGGTTTTCCC AGTCACGACG TTGTAAAACG CCGGAGAAGC GATAATGCGG TCGACCGCTT TCCCCCTACA CGACGTTCCG CTAATTCAAC CCATTGCGGT CCCAAAAGGG TCAGTGCTGC AACATTTTGC 4101ACGGCCAGTG AATTGCAATT CGTAATCATG GTCATAGCTG TTTCCTGTGT GAAATTGTTA TCCGCTCACA ATTCCACACA ACATACGAGC 
CGGAAGCATA TGCCGGTCAC TTAACGTTAA GCATTAGTAC CAGTATCGAC AAAGGACACA CTTTAACAAT AGGCGAGTGT TAAGGTGTGT TGTATGCTCG GCCTTCGTAT                                                                              I-SceI                                                                       ~~~~~~~~~~~~~~~~~~~~4201AAGTGTAAAG CCTGGGGTGC CTAATGAGTG AGCTAACTCA CATTAATTGC GTTGCGCTCA CTGCCATTAC CCTGTTATCC CTAGTGAACC ATCACCCTAA TTCACATTTC GGACCCCACG GATTACTCAC TCGATTGAGT GTAATTAACG CAACGCGAGT GACGGTAATG GGACAATAGG GATCACTTGG TAGTGGGATT 4301TCAAGTTTTT TGGGGTCGAG GTGCCGTAAA GCACTAAATC GGAACCCTAA AGGGAGCCCC CGATTTAGAG CTTGACGGGG AAAGCCGGCG AACGTGGCGA AGTTCAAAAA ACCCCAGCTC CACGGCATTT CGTGATTTAG CCTTGGGATT TCCCTCGGGG GCTAAATCTC GAACTGCCCC TTTCGGCCGC TTGCACCGCT 4401GAAAGGAAGG GAAGAAAGCG AAAGGAGCGG GCGCTAGGGC GCTGGCAAGT GTAGCGGTCA CGCTGCGCGT AACCACCACA CCCGCCGCGC TTAATGCGCC CTTTCCTTCC CTTCTTTCGC TTTCCTCGCC CGCGATCCCG CGACCGTTCA CATCGCCAGT GCGACGCGCA TTGGTGGTGT GGGCGGCGCG AATTACGCGG 4501GCTACAGGGC GCGTCAGGTG GCACTTTTCG GGGAAATGTG CGCGGAACCC CTATTTGTTT ATTTTTCTAA ATACATTCAA ATATGTATCC GCTCATGAGA CGATGTCCCG CGCAGTCCAC CGTGAAAAGC CCCTTTACAC GCGCCTTGGG GATAAACAAA TAAAAAGATT TATGTAAGTT TATACATAGG CGAGTACTCT 4601CAATAACCCT GATAAATGCT TCAATAATAA CGACCGGTAA TGAAAAAGGA AGAGTATGAG TATTCAACAT TTCCGTGTCG CCCTTATTCC CTTTTTTGCG GTTATTGGGA CTATTTACGA AGTTATTATT GCTGGCCATT ACTTTTTCCT TCTCATACTC ATAAGTTGTA AAGGCACAGC GGGAATAAGG GAAAAAACGC 4701GCATTTTGCC TTCCTGTTTT TGCTCACCCA GAAACGCTGG TGAAAGTAAA AGATGCTGAA GATCAGTTGG GTGCACGAGT GGGTTACATC GAACTGGATC CGTAAAACGG AAGGACAAAA ACGAGTGGGT CTTTGCGACC ACTTTCATTT TCTACGACTT CTAGTCAACC CACGTGCTCA CCCAATGTAG CTTGACCTAG 4801TCAACAGCGG TAAGATCCTT GAGAGTTTTC GCCCCGAAGA ACGTTTTCCA ATGATGAGCA CTTTTAAAGT TCTGCTATGT GGCGCGGTAT TATCCCGTAT AGTTGTCGCC ATTCTAGGAA CTCTCAAAAG CGGGGCTTCT TGCAAAAGGT TACTACTCGT GAAAATTTCA AGACGATACA CCGCGCCATA ATAGGGCATA 4901TGACGCCGGG CAAGAGCAAC TCGGTCGCCG CATACACTAT TCTCAGAATG ACTTGGTTGA GTCTAGCGTT GATCGGCACG TAAGAGGTTC CAACTTTCAC ACTGCGGCCC GTTCTCGTTG 
AGCCAGCGGC GTATGTGATA AGAGTCTTAC TGAACCAACT CAGATCGCAA CTAGCCGTGC ATTCTCCAAG GTTGAAAGTG 5001CATAATGAAA TAAGATCACT ACCGGGCGTA TTTTTTGAGT TATCGAGATT TTCAGGAGCT AAGGAAGCTA AAATGGAGAA AAAAATCACT GGATATACCA GTATTACTTT ATTCTAGTGA TGGCCCGCAT AAAAAACTCA ATAGCTCTAA AAGTCCTCGA TTCCTTCGAT TTTACCTCTT TTTTTAGTGA CCTATATGGT 5101CCGTTGATAT ATCCCAATGG CATCGTAAAG AACATTTTGA GGCATTTCAG TCAGTTGCTC AATGTACCTA TAACCAGACC GTTCAGCTGG ATATTACGGC GGCAACTATA TAGGGTTACC GTAGCATTTC TTGTAAAACT CCGTAAAGTC AGTCAACGAG TTACATGGAT ATTGGTCTGG CAAGTCGACC TATAATGCCG 5201CTTTTTAAAG ACCGTAAAGA AAAATAAGCA CAAGTTTTAT CCGGCCTTTA TTCACATTCT TGCCCGCCTG ATGAATGCTC ATCCGGAATT CCGTATGGCA GAAAAATTTC TGGCATTTCT TTTTATTCGT GTTCAAAATA GGCCGGAAAT AAGTGTAAGA ACGGGCGGAC TACTTACGAG TAGGCCTTAA GGCATACCGT 5301ATGAAAGACG GTGAGCTGGT GATATGGGAT AGTGTTCACC CTTGTTACAC CGTTTTCCAT GAGCAAACTG AAACGTTTTC ATCGCTCTGG AGTGAATACC TACTTTCTGC CACTCGACCA CTATACCCTA TCACAAGTGG GAACAATGTG GCAAAAGGTA CTCGTTTGAC TTTGCAAAAG TAGCGAGACC TCACTTATGG 5401ACGACGATTT CCGGCAGTTT CTACACATAT ATTCGCAAGA TGTGGCGTGT TACGGTGAAA ACCTGGCCTA TTTCCCTAAA GGGTTTATTG AGAATATGTT TGCTGCTAAA GGCCGTCAAA GATGTGTATA TAAGCGTTCT ACACCGCACA ATGCCACTTT TGGACCGGAT AAAGGGATTT CCCAAATAAC TCTTATACAA 5501TTTCGTATCA GCCAATCCCT GGGTGAGTTT CACCAGTTTT GATTTAAACG TGGCCAATAT GGACAACTTC TTCGCCCCCG TTTTCACCAT GGGCAAATAT AAAGCATAGT CGGTTAGGGA CCCACTCAAA GTGGTCAAAA CTAAATTTGC ACCGGTTATA CCTGTTGAAG AAGCGGGGGC AAAAGTGGTA CCCGTTTATA 5601TATACGCAAG GCGACAAGGT GCTGATGCCG CTGGCGATTC AGGTTCATCA TGCCGTCTGT GATGGCTTCC ATGTCGGCAG AATGCTTAAT GAATTACAAC ATATGCGTTC CGCTGTTCCA CGACTACGGC GACCGCTAAG TCCAAGTAGT ACGGCAGACA CTACCGAAGG TACAGCCGTC TTACGAATTA CTTAATGTTG 5701AGTACTGCGA TGAGTGGCAG GGCGGGGCGT AATTTTTTTA AGGCAGTTAT TGGTGCCCTT AAACGCCTGG TGCTACGCCT GAATAAGTGA TAATAAGCGG TCATGACGCT ACTCACCGTC CCGCCCCGCA TTAAAAAAAT TCCGTCAATA ACCACGGGAA TTTGCGGACC ACGATGCGGA CTTATTCACT ATTATTCGCC 5801ATGAATGGCA GAAATTCGAA ATGACCGACC AAGCGACGCC CAACCTGCCA TCACGAGATT TCGATTCCAC CGCCGCCTTC TATGAAAGGT TGGGCTTCGG 
TACTTACCGT CTTTAAGCTT TACTGGCTGG TTCGCTGCGG GTTGGACGGT AGTGCTCTAA AGCTAAGGTG GCGGCGGAAG ATACTTTCCA ACCCGAAGCC 5901AATCGTTTTC CGGGACGCCG GCTGGATGAT CCTCCAGCGC GGGGATCTCA TGCTGGAGTT CTTCGCCCAC CCTAGGGGGA GGCTAACTGA AACACGGAAG TTAGCAAAAG GCCCTGCGGC CGACCTACTA GGAGGTCGCG CCCCTAGAGT ACGACCTCAA GAAGCGGGTG GGATCCCCCT CCGATTGACT TTGTGCCTTC 6001GAGACAATAC CGGAAGGAAC CCGCGCTATG ACGGCAATAA AAAGACAGAA TAAAACGCAC GGTGTTGGGT CGTTTGTTCA TAAACGCGGG GTTCGGTCCC CTCTGTTATG GCCTTCCTTG GGCGCGATAC TGCCGTTATT TTTCTGTCTT ATTTTGCGTG CCACAACCCA GCAAACAAGT ATTTGCGCCC CAAGCCAGGG 6101AGGGCTGGCA CTCTGTCGAT ACCCCACCGA GACCCCATTG GGGCCAATAC GCCCGCGTTT CTTCCTTTTC CCCACCCCAC CCCCCAAGTT CGGGTGAAGG TCCCGACCGT GAGACAGCTA TGGGGTGGCT CTGGGGTAAC CCCGGTTATG CGGGCGCAAA GAAGGAAAAG GGGTGGGGTG GGGGGTTCAA GCCCACTTCC 6201CCCAGGGCTC GCAGCCAACG TCGGGGCGGC AGGCCCTGCC ATAGCCTCAG GTTACTCATA TATACTTTAG ATTGATTTAA AACTTCATTT TTAATTTAAA GGGTCCCGAG CGTCGGTTGC AGCCCCGCCG TCCGGGACGG TATCGGAGTC CAATGAGTAT ATATGAAATC TAACTAAATT TTGAAGTAAA AATTAAATTT 6301AGGATCTAGG TGAAGATCCT TTTTGATAAT CTCATGACCA AAATCCCTTA ACGTGAGTTT TCGTTCCACT GAGCGTCAGA CCCCGTAGAA AAGATCAAAGTCCTAGATCC ACTTCTAGGA AAAACTATTA GAGTACTGGT TTTAGGGAAT TGCACTCAAA AGCAAGGTGA CTCGCAGTCT GGGGCATCTT TTCTAGTTTC6401GATCTTCTTG AGATCCTTTT TTTCTGCGCG TAATCTGCTG CTTGCAAACA AAAAAACCAC CGCTACCAGC GGTGGTTTGT TTGCCGGATC AAGAGCTACCCTAGAAGAAC TCTAGGAAAA AAAGACGCGC ATTAGACGAC GAACGTTTGT TTTTTTGGTG GCGATGGTCG CCACCAAACA AACGGCCTAG TTCTCGATGG6501AACTCTTTTT CCGAAGGTAA CTGGCTTCAG CAGAGCGCAG ATACCAAATA CTGTCCTTCT AGTGTAGCCG TAGTTAGGCC ACCACTTCAA GAACTCTGTATTGAGAAAAA GGCTTCCATT GACCGAAGTC GTCTCGCGTC TATGGTTTAT GACAGGAAGA TCACATCGGC ATCAATCCGG TGGTGAAGTT CTTGAGACAT6601GCACCGCCTA CATACCTCGC TCTGCTAATC CTGTTACCAG TGGCTGCTGC CAGTGGCGAT AAGTCGTGTC TTACCGGGTT GGACTCAAGA CGATAGTTACCGTGGCGGAT GTATGGAGCG AGACGATTAG GACAATGGTC ACCGACGAGC GTCACCGCTA TTCAGCACAG AATGGCCCAA CCTGAGTTCT GCTATCAATG6701CGGATAAGGC GCAGCGGTCG GGCTGAACGG GGGGTTCGTG CACACAGCCC AGCTTGGAGC GAACGACCTA CACCGAACTG AGATACCTAC 
AGCGTGAGCTGCCTATTCCG CGTCGCCAGC CCGACTTGCC CCCCAAGCAC GTGTGTCGGG TCGAACCTCG CTTGCTGGAT GTGGCTTGAC TCTATGGATG TCGCACTCGA6801ATGAGAAAGC GCCACGCTTC CCGAAGGGAG AAAGGCGGAC AGGTATCCGG TAAGCGGCAG GGTCGGAACA GGAGAGCGCA CGAGGGAGCT TCCAGGGGGATACTCTTTCG CGGTGCGAAG GGCTTCCCTC TTTCCGCCTG TCCATAGGCC ATTCGCCGTC CCAGCCTTGT CCTCTCGCGT GCTCCCTCGA AGGTCCCCCT6901AACGCCTGGT ATCTTTATAG TCCTGTCGGG TTTCGCCACC TCTGACTTGA GCGTCGATTT TTGTGATGCT CGTCAGGGGG GCGGAGCCTA TGGAAAAACGTTGCGGACCA TAGAAATATC AGGACAGCCC AAAGCGGTGG AGACTGAACT CGCAGCTAAA AACACTACGA GCAGTCCCCC CGCCTCGGAT ACCTTTTTGC7001CCAGCAACGC GGCCTTTTTA CGGTTCCTGG CCTTTTGCTG GCCTTTTGCT CACATGTTCT TTCCTGCGTT ATCCCCTGAT TCTGTGGATA ACCGTATTACGGTCGTTGCG CCGGAAAAAT GCCAAGGACC GGAAAACGAC CGGAAAACGA GTGTACAAGA AAGGACGCAA TAGGGGACTA AGACACCTAT TGGCATAATG7101CGCCATGCAT TAGTTATTAA TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG CGTTACATAA CTTACGGTAA ATGGCCCGCCGCGGTACGTA ATCAATAATT ATCATTAGTT AATGCCCCAG TAATCAAGTA TCGGGTATAT ACCTCAAGGC GCAATGTATT GAATGCCATT TACCGGGCGG7201TGGCTGACCG CCCAACGACC CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA ATGGGTGGAGACCGACTGGC GGGTTGCTGG GGGCGGGTAA CTGCAGTTAT TACTGCATAC AAGGGTATCA TTGCGGTTAT CCCTGAAAGG TAACTGCAGT TACCCACCTC7301TATTTACGGT AAACTGCCCA CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGACG TCAATGACGG TAAATGGCCC GCCTGGCATTATAAATGCCA TTTGACGGGT GAACCGTCAT GTAGTTCACA TAGTATACGG TTCATGCGGG GGATAACTGC AGTTACTGCC ATTTACCGGG CGGACCGTAA7401ATGCCCAGTA CATGACCTTA TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC CATGGTGATG CGGTTTTGGC AGTACATCAATACGGGTCAT GTACTGGAAT ACCCTGAAAG GATGAACCGT CATGTAGATG CATAATCAGT AGCGATAATG GTACCACTAC GCCAAAACCG TCATGTAGTT7501TGGGCGTGGA TAGCGGTTTG ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG TTTTGGCACC AAAATCAACG GGACTTTCCAACCCGCACCT ATCGCCAAAC TGAGTGCCCC TAAAGGTTCA GAGGTGGGGT AACTGCAGTT ACCCTCAAAC AAAACCGTGG TTTTAGTTGC CCTGAAAGGT7601AAATGTCGTA ACAACTCCGC CCCATTGACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA GCAGAGCTTTTACAGCAT 
TGTTGAGGCG GGGTAACTGC GTTTACCCGC CATCCGCACA TGCCACCCTC CAGATATATT CGTCTCGApVHentry-GFP1                               Esp3I                              ~~~~~~~ 1GGTTTAGTGA ACCGTCAGAT CCGCTAGACG TCTCATATAC CTGACTGGAA TACGACAGCT CCTGCAGCTT CTGGGCGAAG ACCACCGTGG CCCATTGCGT CCAAATCACT TGGCAGTCTA GGCGATCTGC AGAGTATATG GACTGACCTT ATGCTGTCGA GGACGTCGAA GACCCGCTTC TGGTGGCACC GGGTAACGCA 101ACTTAGCGAT AATCTGGTCC GCTTGGAAGT TAGCACGGCG AGCGCGCTCC AGAGCCAAGT CACGCAGCTT AACAGTACCT ACCGCAGAGC GGTGCATGAA TGAATCGCTA TTAGACCAGG CGAACCTTCA ATCGTGCCGC TCGCGCGAGG TCTCGGTTCA GTGCGTCGAA TTGTCATGGA TGGCGTCTCG CCACGTACTT 201CAGGCCGATA ACGTTGTCCT TAGCAACCTT GACATTACCC TCACCTTTAT TGGCAGGGAA GACGTGCTTC TGACCAGTAG TGCCCTCACG AGCGGTACCA GTCCGGCTAT TGCAACAGGA ATCGTTGGAA CTGTAATGGG AGTGGAAATA ACCGTCCCTT CTGCACGAAG ACTGGTCATC ACGGGAGTGC TCGCCATGGT 301GCACCACCAG CGGTGAGGTG CGGAACTTCT ACAACCTCAA AGCCCATAAC GTTGCGGATA GAACCCTTCT CAGGGTCAAT CAGAGCAGCG TAGTTTGCTG CGTGGTGGTC GCCACTCCAC GCCTTGAAGA TGTTGGAGTT TCGGGTATTG CAACGCCTAT CTTGGGAAGA GTCCCAGTTA GTCTCGTCGC ATCAAACGAC 401CGTTCGGCAT CAGTGCTGCC AGAATCGCAG AGTAGCTATC TGGGTCACAG TAGAACACAC GGTCAGCAGC CGGAACATAG TTCTTGGTCA GAGCCGCACG GCAAGCCGTA GTCACGACGG TCTTAGCGTC TCATCGATAG ACCCAGTGTC ATCTTGTGTG CCAGTCGTCG GCCTTGTATC AAGAACCAGT CTCGGCGTGC 501AGCCTTAGTC AGAGCCGCAA TAATCTCCTT ACCCAGCGCA ACTTGGTCGG TAAGTGCGGC CTTGTTCTGA GTGGTCTCAA TTACGGTAGC AGTACCTAAG TCGGAATCAG TCTCGGCGTT ATTAGAGGAA TGGGTCGCGT TGAACCAGCC ATTCACGCCG GAACAAGACT CACCAGAGTT AATGCCATCG TCATGGATTC 601CCCTCGATGT TCTCATTATA TTTGCTTTCC ACGTTACACA GACCGGCAAT CTCAGCCAGA ACCGCACCAT CCGCAGCCAT CGCCAGAGAT TCACCCAACT GGGAGCTACA AGAGTAATAT AAACGAAAGG TGCAATGTGT CTGGCCGTTA GAGTCGGTCT TGGCGTGGTA GGCGTCGGTA GCGGTCTCTA AGTGGGTTGA 701GAGAGGTATA CTCAGAGCGA ACGTCGTAGT GGTTCATCGC GTCCTCAATA TCATAAATCA GAACGTCAGC CGTCAGGAGA CCGTCAATGG TGATTACCTT CTCTCCATAT GAGTCTCGCT TGCAGCATCA CCAAGTAGCG CAGGAGTTAT AGTATTTAGT CTTGCAGTCG GCAGTCCTCT GGCAGTTACC ACTAATGGAA 801CTCGGTGTGT TTGATGTCCT TACGTTTATC GTCGAGGTTC 
TCGCCCGGAG CCAGATACGC TGCCTGAGTG CGACCCAGAA CAGGGAACTG AGCGGATTTA GAGCCACACA AACTACAGGA ATGCAAATAG CAGCTCCAAG AGCGGGCCTC GGTCTATGCG ACGGACTCAC GCTGGGTCTT GTCCCTTGAC TCGCCTAAAT 901CCGCTGGAGA TGGAACGTAC CATGTGGCGA GAAGTGGTCA CGGAGGTACG AGCGAACGCA GTCAGGACTT CACCGCCAAA TACCTTCAAG AACAACGCCA GGCGACCTCT ACCTTGCATG GTACACCGCT CTTCACCAGT GCCTCCATGC TCGCTTGCGT CAGTCCTGAA GTGGCGGTTT ATGGAAGTTC TTGTTGCGGT                                                                                                         Esp3I                                                                                                        ~~~~~1001GTTTATCTCC AGCAGCAACT ACACCTTTAC CTTGGTTAGT ACCCATTTGC TGTCCACCAG TCATGCTAGC CATATGTATA TCTCCTTCTT AAAGTCGTCT CAAATAGAGG TCGTCGTTGA TGTGGAAATG GAACCAATCA TGGGTAAACG ACAGGTGGTC AGTACGATCG GTATACATAT AGAGGAAGAA TTTCAGCAGA Esp3I ~ 1101CCAGTGCCTC CACCAAGGGC CCATCGGTCT TCCCCCTGGC GCCCTGCTCC AGGAGCACCT CCGAGAGCAC AGCGGCCCTG GGCTGCCTGG TCAAGGACTA GGTCACGGAG GTGGTTCCCG GGTAGCCAGA AGGGGGACCG CGGGACGAGG TCCTCGTGGA GGCTCTCGTG TCGCCGGGAC CCGACGGACC AGTTCCTGAT 1201CTTCCCCGAA CCGGTGACGG TGTCGTGGAA CTCAGGCGCT CTGACCAGCG GCGTGCACAC CTTCCCAGCT GTCCTACAGT CCTCAGGACT CTACTCCCTC GAAGGGGCTT GGCCACTGCC ACAGCACCTT GAGTCCGCGA GACTGGTCGC CGCACGTGTG GAAGGGTCGA CAGGATGTCA GGAGTCCTGA GATGAGGGAG 1301AGCAGCGTGG TGACCGTGCC CTCCAGCAGC TTGGGCACCC AGACCTACAT CTGCAACGTG AATCACAAGC CCAGCAACAC CAAGGTGGAC AAGAAAGTTG TCGTCGCACC ACTGGCACGG GAGGTCGTCG AACCCGTGGG TCTGGATGTA GACGTTGCAC TTAGTGTTCG GGTCGTTGTG GTTCCACCTG TTCTTTCAAC 1401AGCCCAAATC TTGTGACAAA ACTCACACAT GCCCACCGTG CCCAGCACCT GAACTCCTGG GGGGACCGTC AGTCTTCCTC TTCCCCCCMA AACCCAAGGA TCGGGTTTAG AACACTGTTT TGAGTGTGTA CGGGTGGCAC GGGTCGTGGA CTTGAGGACC CCCCTGGCAG TCAGAAGGAG AAGGGGGGKT TTGGGTTCCT 1501CACCCTCATG ATCTCCCGGA CCCCTGAGGT CACATGCGTG GTGGTGGACG TGAGCCACGA AGACCCTGAG GTCAAGTTCA ACTGGTACGT GGACGGCGTG GTGGGAGTAC TAGAGGGCCT GGGGACTCCA GTGTACGCAC CACCACCTGC ACTCGGTGCT TCTGGGACTC CAGTTCAAGT TGACCATGCA CCTGCCGCAC 1601GAGGTGCATA ATGCCAAGAC 
AAAGCCGCGG GAGGAGCAGT ACAACAGCAC GTACCGTGTG GTCAGCGTCC TCACCGTCCT GCACCAGGAC TGGCTGAATG CTCCACGTAT TACGGTTCTG TTTCGGCGCC CTCCTCGTCA TGTTGTCGTG CATGGCACAC CAGTCGCAGG AGTGGCAGGA CGTGGTCCTG ACCGACTTAC 1701GCAAGGAGTA CAAGTGCAAG GTCTCCAACA AAGCCCTCCC AGCCCCCATC GAGAAAACCA TCTCCAAAGC CAAAGGGCAG CCCCGAGAAC CACAGGTGTA CGTTCCTCAT GTTCACGTTC CAGAGGTTGT TTCGGGAGGG TCGGGGGTAG CTCTTTTGGT AGAGGTTTCG GTTTCCCGTC GGGGCTCTTG GTGTCCACAT 1801CACCCTGCCC CCATCCCGGG ATGAGCTGAC CAAGAACCAG GTCAGCCTGA CCTGCCTGGT CAAAGGCTTC TACCCCAGCG ACATCGCCGT GGAGTGGGAG GTGGGACGGG GGTAGGGCCC TACTCGACTG GTTCTTGGTC CAGTCGGACT GGACGGACCA GTTTCCGAAG ATGGGGTCGC TGTAGCGGCA CCTCACCCTC 1901AGCAATGGGC AGCCGGAGAA CAACTACAAG ACCACGCCTC CCATGCTGGA CTCCGACGGC TCCTTCTTCC TCTACAGCAA GCTCACCGTG GACAAGAGCA TCGTTACCCG TCGGCCTCTT GTTGATGTTC TGGTGCGGAG GGTACGACCT GAGGCTGCCG AGGAAGAAGG AGATGTCGTT CGAGTGGCAC CTGTTCTCGT 2001GGTGGCAGCA GGGGAACGTC TTCTCATGCT CCGTGATGCA TGAGGCTCTG CACAACCACT ACACGCAGAA GAGCCTCTCC CTGTCTCCGG GTAAAGGGAG CCACCGTCGT CCCCTTGCAG AAGAGTACGA GGCACTACGT ACTCCGAGAC GTGTTGGTGA TGTGCGTCTT CTCGGAGAGG GACAGAGGCC CATTTCCCTC 2101CTCGCCAGAT AAGTGGTCAG ATCCACCGGT CGCCACCATG GTGAGCAAGG GCGAGGAGCT GTTCACCGGG GTGGTGCCCA TCCTGGTCGA GCTGGACGGC GAGCGGTCTA TTCACCAGTC TAGGTGGCCA GCGGTGGTAC CACTCGTTCC CGCTCCTCGA CAAGTGGCCC CACCACGGGT AGGACCAGCT CGACCTGCCG 2201GACGTAAACG GCCACAAGTT CAGCGTGTCC GGCGAGGGCG AGGGCGATGC CACCTACGGC AAGCTGACCC TGAAGTTCAT CTGCACCACC GGCAAGCTGC CTGCATTTGC CGGTGTTCAA GTCGCACAGG CCGCTCCCGC TCCCGCTACG GTGGATGCCG TTCGACTGGG ACTTCAAGTA GACGTGGTGG CCGTTCGACG 2301CCGTGCCCTG GCCCACCCTC GTGACCACCC TGACCTACGG CGTGCAGTGC TTCAGCCGCT ACCCCGACCA CATGAAGCAG CACGACTTCT TCAAGTCCGC GGCACGGGAC CGGGTGGGAG CACTGGTGGG ACTGGATGCC GCACGTCACG AAGTCGGCGA TGGGGCTGGT GTACTTCGTC GTGCTGAAGA AGTTCAGGCG 2401CATGCCCGAA GGCTACGTCC AGGAGCGCAC CATCTTCTTC AAGGACGACG GCAACTACAA GACCCGCGCC GAGGTGAAGT TCGAGGGCGA CACCCTGGTG GTACGGGCTT CCGATGCAGG TCCTCGCGTG GTAGAAGAAG TTCCTGCTGC CGTTGATGTT CTGGGCGCGG CTCCACTTCA AGCTCCCGCT GTGGGACCAC 
2501AACCGCATCG AGCTGAAGGG CATCGACTTC AAGGAGGACG GCAACATCCT GGGGCACAAG CTGGAGTACA ACTACAACAG CCACAACGTC TATATCATGGTTGGCGTAGC TCGACTTCCC GTAGCTGAAG TTCCTCCTGC CGTTGTAGGA CCCCGTGTTC GACCTCATGT TGATGTTGTC GGTGTTGCAG ATATAGTACC2601CCGACAAGCA GAAGAACGGC ATCAAGGTGA ACTTCAAGAT CCGCCACAAC ATCGAGGACG GCAGCGTGCA GCTCGCCGAC CACTACCAGC AGAACACCCCGGCTGTTCGT CTTCTTGCCG TAGTTCCACT TGAAGTTCTA GGCGGTGTTG TAGCTCCTGC CGTCGCACGT CGAGCGGCTG GTGATGGTCG TCTTGTGGGG2701CATCGGCGAC GGCCCCGTGC TGCTGCCCGA CAACCACTAC CTGAGCACCC AGTCCGCCCT GAGCAAAGAC CCCAACGAGA AGCGCGATCA CATGGTCCTGGTAGCCGCTG CCGGGGCACG ACGACGGGCT GTTGGTGATG GACTCGTGGG TCAGGCGGGA CTCGTTTCGT GGGTTGCTCT TCGCGCTAGT GTACCAGGAC2801CTGGAGTTCG TGACCGCCGC CGGGATCACT CTCGGCATGG ACGAGCTGTA CAAGTAAAGC GGCCGCAATT TAATTCCGGT TATTTTCCAC CATATTGCCGGACCTCAAGC ACTGGCGGCG GCCCTAGTGA GAGCCGTACC TGCTCGACAT GTTCATTTCG CCGGCGTTAA ATTAAGGCCA ATAAAAGGTG GTATAACGGC2901TCTTTTGGCA ATGTGAGGGC CCGGAAACCT GGCCCTGTCT TCTTGACGAG CATTCCTAGG GGTCTTTCCC CTCTCGCCAA AGGAATGCAA GGTCTGTTGAAGAAAACCGT TACACTCCCG GGCCTTTGGA CCGGGACAGA AGAACTGCTC GTAAGGATCC CCAGAAAGGG GAGAGCGGTT TCCTTACGTT CCAGACAACT3001ATGTCGTGAA GGAAGCAGTT CCTCTGGAAG CTTCTTGAAG ACAAACAACG TCTGTAGCGA CCCTTTGCAG GCAGCGGAAC CCCCCACCTG GCGACAGGTGTACAGCACTT CCTTCGTCAA GGAGACCTTC GAAGAACTTC TGTTTGTTGC AGACATCGCT GGGAAACGTC CGTCGCCTTG GGGGGTGGAC CGCTGTCCAC3101CCTCTGCGGC CAAAAGCCAC GTGTATAAGA TACACCTGCA AAGGCGGCAC AACCCCAGTG CCACGTTGTG AGTTGGATAG TTGTGGAAAG AGTCAAATGGGGAGACGCCG GTTTTCGGTG CACATATTCT ATGTGGACGT TTCCGCCGTG TTGGGGTCAC GGTGCAACAC TCAACCTATC AACACCTTTC TCAGTTTACC3201CTCACCTCAA GCGTATTCAA CAAGGGGCTG AAGGATGCCC AGAAGGTACC CCATTGTATG GGATCTGATC TGGGGCCTCG GTGCACATGC TTTACATGTGGAGTGGAGTT CGCATAAGTT GTTCCCCGAC TTCCTACGGG TCTTCCATGG GGTAACATAC CCTAGACTAG ACCCCGGAGC CACGTGTACG AAATGTACAC3301TTTAGTCGAG GTTAAAAAAC GTCTAGGCCC CCCGAACCAC GGGGACGTGG TTTTCCTTTG AAAAACACGA TGATAATATG GCCACCACCC ATACCTAGGCAAATCAGCTC CAATTTTTTG CAGATCCGGG GGGCTTGGTG CCCCTGCACC AAAAGGAAAC TTTTTGTGCT ACTATTATAC CGGTGGTGGG 
TATGGATCCG3401TTTTGCAAAG ATCGATCAAG AGACAGGATG AGGATCGTTT CGCATGATTG AACAAGATGG ATTGCACGCA GGTTCTCCGG CCGCTTGGGT GGAGAGGCTAAAAACGTTTC TAGCTAGTTC TCTGTCCTAC TCCTAGCAAA GCGTACTAAC TTGTTCTACC TAACGTGCGT CCAAGAGGCC GGCGAACCCA CCTCTCCGAT3501TTCGGCTATG ACTGGGCACA ACAGACAATC GGCTGCTCTG ATGCCGCCGT GTTCCGGCTG TCAGCGCAGG GGCGCCCGGT TCTTTTTGTC AAGACCGACCAAGCCGATAC TGACCCGTGT TGTCTGTTAG CCGACGAGAC TACGGCGGCA CAAGGCCGAC AGTCGCGTCC CCGCGGGCCA AGAAAAACAG TTCTGGCTGG3601TGTCCGGTGC CCTGAATGAA CTGCAAGACG AGGCAGCGCG GCTATCGTGG CTGGCCACGA CGGGCGTTCC TTGCGCAGCT GTGCTCGACG TTGTCACTGAACAGGCCACG GGACTTACTT GACTGGCTGC TCCGTCGCGC CGATAGCACC GACCGGTGCT GCCCGCAAGG AACGCGTCGA CACGAGCTGC AACAGTGACT3701AGCGGGAAGG GACTGGCTGC TATTGGGCGA AGTGCCGGGG CAGGATCTCC TGTCATCTCA CCTTGCTCCT GCCGAGAAAG TATCCATCAT GGCTGATGCA TCGCCCTTCC CTGACCGACG ATAACCCGCT TCACGGCCCC GTCCTAGAGG ACAGTAGAGT GGAACGAGGA CGGCTCTTTC ATAGGTAGTA CCGACTACGT 3801ATGCGGCGGC TGCATACGCT TGATCCGGCT ACCTGCCCAT TCGACCACCA AGCGAAACAT CGCATCGAGC GAGCACGTAC TCGGATGGAA GCCGGTCTTG TACGCCGCCG ACGTATGCGA ACTAGGCCGA TGGACGGGTA AGCTGGTGGT TCGCTTTGTA GCGTAGCTCG CTCGTGCATG AGCCTACCTT CGGCCAGAAC 3901TCGATCAGGA TGATCTGGAC GAAGAGCATC AGGGGCTCGC GCCAGCCGAA CTGTTCGCCA GGCTCAAGGC GAGCATGCCC GACGGCGAGG ATCTCGTCGT AGCTAGTCCT ACTAGACCTG CTTCTCGTAG TCCCCGAGCG CGGTCGGCTT GACAAGCGGT CCGAGTTCCG CTCGTACGGG CTGCCGCTCC TAGAGCAGCA 4001GACCCATGGC GATGCCTGCT TGCCGAATAT CATGGTGGAA AATGGCCGCT TTTCTGGATT CATCGACTGT GGCCGGCTGG GTGTGGCGGA CCGCTATCAG CTGGGTACCG CTACGGACGA ACGGCTTATA GTACCACCTT TTACCGGCGA AAAGACCTAA GTAGCTGACA CCGGCCGACC CACACCGCCT GGCGATAGTC 4101GACATAGCGT TGGCTACCCG TGATATTGCT GAAGAGCTTG GCGGCGAATG GGCTGACCGC TTCCTCGTGC TTTACGGTAT CGCCGCTCCC GATTCGCAGC CTGTATCGCA ACCGATGGGC ACTATAACGA CTTCTCGAAC CGCCGCTTAC CCGACTGGCG AAGGAGCACG AAATGCCATA GCGGCGAGGG CTAAGCGTCG 4201GCATCGCCTT CTATCGCCTT CTTGACGAGT TCTTCTGAGC GGGACTCTGG GGTTCGGGCC GCACTCGAGC ATAAACTTGT TTATTGCAGC TTATAATGGT CGTAGCGGAA GATAGCGGAA GAACTGCTCA AGAAGACTCG CCCTGAGACC CCAAGCCCGG CGTGAGCTCG TATTTGAACA 
AATAACGTCG AATATTACCA                                                                                                           I-SceI                                                                                                          ~~~4301TACAAATAAA GCAATAGCAT CACAAATTTC ACAAATAAAG CATTTTTTTC ACTGCATTCT AGTTGTGGTT TGTCCAAACT CATCAATGTA TCTTAAGTAG ATGTTTATTT CGTTATCGTA GTGTTTAAAG TGTTTATTTC GTAAAAAAAG TGACGTAAGA TCAACACCAA ACAGGTTTGA GTAGTTACAT AGAATTCATC     I-SceI ~~~~~~~~~~~~~~~~ 4401GGATAACAGG GTAATTTTGT TAAATCAGCT CATTTTTTAA CCAATAGGAA CGCCATCAAA AATAATTCGC GTCTGGCCTT CCTGTAGCCA GCTTTCATCA CCTATTGTCC CATTAAAACA ATTTAGTCGA GTAAAAAATT GGTTATCCTT GCGGTAGTTT TTATTAAGCG CAGACCGGAA GGACATCGGT CGAAAGTAGT 4501ACATTAAATG TGAGCGAGTA ACAACCCGTC GGATTCTCCG TGGGAACAAA CGGCGGATTG ACCGTAATGG GATAGGTTAC GTTGGTGTAG ATGGGCGCAT TGTAATTTAC ACTCGCTCAT TGTTGGGCAG CCTAAGAGGC ACCCTTGTTT GCCGCCTAAC TGGCATTACC CTATCCAATG CAACCACATC TACCCGCGTA 4601CGTAACCGTG CATCTGCCAG TTTGAGGGGA CGACGACCGT ATCGGCCTCA GGAAGATCGC ACTCCAGCCA GCTTTCCGGC ACCGCTTCTG GTGCCGGAAA GCATTGGCAC GTAGACGGTC AAACTCCCCT GCTGCTGGCA TAGCCGGAGT CCTTCTAGCG TGAGGTCGGT CGAAAGGCCG TGGCGAAGAC CACGGCCTTT 4701CCAGGCAAAG CGCCATTCGC CATTCAGGCT GCGCAACTGT TGGGAAGGGC GATCGGTGCG GGCCTCTTCG CTATTACGCC AGCTGGCGAA AGGGGGATGT GGTCCGTTTC GCGGTAAGCG GTAAGTCCGA CGCGTTGACA ACCCTTCCCG CTAGCCACGC CCGGAGAAGC GATAATGCGG TCGACCGCTT TCCCCCTACA 4801GCTGCAAGGC GATTAAGTTG GGTAACGCCA GGGTTTTCCC AGTCACGACG TTGTAAAACG ACGGCCAGTG AATTGCAATT CGTAATCATG GTCATAGCTG CGACGTTCCG CTAATTCAAC CCATTGCGGT CCCAAAAGGG TCAGTGCTGC AACATTTTGC TGCCGGTCAC TTAACGTTAA GCATTAGTAC CAGTATCGAC 4901TTTCCTGTGT GAAATTGTTA TCCGCTCACA ATTCCACACA ACATACGAGC CGGAAGCATA AAGTGTAAAG CCTGGGGTGC CTAATGAGTG AGCTAACTCA AAAGGACACA CTTTAACAAT AGGCGAGTGT TAAGGTGTGT TGTATGCTCG GCCTTCGTAT TTCACATTTC GGACCCCACG GATTACTCAC TCGATTGAGT                                  I-SceI                           ~~~~~~~~~~~~~~~~~~~~ 5001CATTAATTGC GTTGCGCTCA CTGCCATTAC CCTGTTATCC CTAGTGAACC ATCACCCTAA 
TCAAGTTTTT TGGGGTCGAG GTGCCGTAAA GCACTAAATC GTAATTAACG CAACGCGAGT GACGGTAATG GGACAATAGG GATCACTTGG TAGTGGGATT AGTTCAAAAA ACCCCAGCTC CACGGCATTT CGTGATTTAG 5101GGAACCCTAA AGGGAGCCCC CGATTTAGAG CTTGACGGGG AAAGCCGGCG AACGTGGCGA GAAAGGAAGG GAAGAAAGCG AAAGGAGCGG GCGCTAGGGC CCTTGGGATT TCCCTCGGGG GCTAAATCTC GAACTGCCCC TTTCGGCCGC TTGCACCGCT CTTTCCTTCC CTTCTTTCGC TTTCCTCGCC CGCGATCCCG 5201GCTGGCAAGT GTAGCGGTCA CGCTGCGCGT AACCACCACA CCCGCCGCGC TTAATGCGCC GCTACAGGGC GCGTCAGGTG GCACTTTTCG GGGAAATGTG CGACCGTTCA CATCGCCAGT GCGACGCGCA TTGGTGGTGT GGGCGGCGCG AATTACGCGG CGATGTCCCG CGCAGTCCAC CGTGAAAAGC CCCTTTACAC 5301CGCGGAACCC CTATTTGTTT ATTTTTCTAA ATACATTCAA ATATGTATCC GCTCATGAGA CAATAACCCT GATAAATGCT TCAATAATAA CGACCGGTAA GCGCCTTGGG GATAAACAAA TAAAAAGATT TATGTAAGTT TATACATAGG CGAGTACTCT GTTATTGGGA CTATTTACGA AGTTATTATT GCTGGCCATT 5401TGAAAAAGGA AGAGTATGAG TATTCAACAT TTCCGTGTCG CCCTTATTCC CTTTTTTGCG GCATTTTGCC TTCCTGTTTT TGCTCACCCA GAAACGCTGG ACTTTTTCCT TCTCATACTC ATAAGTTGTA AAGGCACAGC GGGAATAAGG GAAAAAACGC CGTAAAACGG AAGGACAAAA ACGAGTGGGT CTTTGCGACC 5501TGAAAGTAAA AGATGCTGAA GATCAGTTGG GTGCACGAGT GGGTTACATC GAACTGGATC TCAACAGCGG TAAGATCCTT GAGAGTTTTC GCCCCGAAGA ACTTTCATTT TCTACGACTT CTAGTCAACC CACGTGCTCA CCCAATGTAG CTTGACCTAG AGTTGTCGCC ATTCTAGGAA CTCTCAAAAG CGGGGCTTCT 5601ACGTTTTCCA ATGATGAGCA CTTTTAAAGT TCTGCTATGT GGCGCGGTAT TATCCCGTAT TGACGCCGGG CAAGAGCAAC TCGGTCGCCG CATACACTAT TGCAAAAGGT TACTACTCGT GAAAATTTCA AGACGATACA CCGCGCCATA ATAGGGCATA ACTGCGGCCC GTTCTCGTTG AGCCAGCGGC GTATGTGATA 5701TCTCAGAATG ACTTGGTTGA GTCTAGCGTT GATCGGCACG TAAGAGGTTC CAACTTTCAC CATAATGAAA TAAGATCACT ACCGGGCGTA TTTTTTGAGT AGAGTCTTAC TGAACCAACT CAGATCGCAA CTAGCCGTGC ATTCTCCAAG GTTGAAAGTG GTATTACTTT ATTCTAGTGA TGGCCCGCAT AAAAAACTCA 5801TATCGAGATT TTCAGGAGCT AAGGAAGCTA AAATGGAGAA AAAAATCACT GGATATACCA CCGTTGATAT ATCCCAATGG CATCGTAAAG AACATTTTGA ATAGCTCTAA AAGTCCTCGA TTCCTTCGAT TTTACCTCTT TTTTTAGTGA CCTATATGGT GGCAACTATA TAGGGTTACC GTAGCATTTC TTGTAAAACT 5901GGCATTTCAG TCAGTTGCTC AATGTACCTA TAACCAGACC 
GTTCAGCTGG ATATTACGGC CTTTTTAAAG ACCGTAAAGA AAAATAAGCA CAAGTTTTAT CCGTAAAGTC AGTCAACGAG TTACATGGAT ATTGGTCTGG CAAGTCGACC TATAATGCCG GAAAAATTTC TGGCATTTCT TTTTATTCGT GTTCAAAATA 6001CCGGCCTTTA TTCACATTCT TGCCCGCCTG ATGAATGCTC ATCCGGAATT CCGTATGGCA ATGAAAGACG GTGAGCTGGT GATATGGGAT AGTGTTCACC GGCCGGAAAT AAGTGTAAGA ACGGGCGGAC TACTTACGAG TAGGCCTTAA GGCATACCGT TACTTTCTGC CACTCGACCA CTATACCCTA TCACAAGTGG 6101CTTGTTACAC CGTTTTCCAT GAGCAAACTG AAACGTTTTC ATCGCTCTGG AGTGAATACC ACGACGATTT CCGGCAGTTT CTACACATAT ATTCGCAAGA GAACAATGTG GCAAAAGGTA CTCGTTTGAC TTTGCAAAAG TAGCGAGACC TCACTTATGG TGCTGCTAAA GGCCGTCAAA GATGTGTATA TAAGCGTTCT6201TGTGGCGTGT TACGGTGAAA ACCTGGCCTA TTTCCCTAAA GGGTTTATTG AGAATATGTT TTTCGTATCA GCCAATCCCT GGGTGAGTTT CACCAGTTTTACACCGCACA ATGCCACTTT TGGACCGGAT AAAGGGATTT CCCAAATAAC TCTTATACAA AAAGCATAGT CGGTTAGGGA CCCACTCAAA GTGGTCAAAA6301GATTTAAACG TGGCCAATAT GGACAACTTC TTCGCCCCCG TTTTCACCAT GGGCAAATAT TATACGCAAG GCGACAAGGT GCTGATGCCG CTGGCGATTCCTAAATTTGC ACCGGTTATA CCTGTTGAAG AAGCGGGGGC AAAAGTGGTA CCCGTTTATA ATATGCGTTC CGCTGTTCCA CGACTACGGC GACCGCTAAG6401AGGTTCATCA TGCCGTCTGT GATGGCTTCC ATGTCGGCAG AATGCTTAAT GAATTACAAC AGTACTGCGA TGAGTGGCAG GGCGGGGCGT AATTTTTTTATCCAAGTAGT ACGGCAGACA CTACCGAAGG TACAGCCGTC TTACGAATTA CTTAATGTTG TCATGACGCT ACTCACCGTC CCGCCCCGCA TTAAAAAAAT6501AGGCAGTTAT TGGTGCCCTT AAACGCCTGG TGCTACGCCT GAATAAGTGA TAATAAGCGG ATGAATGGCA GAAATTCGAA ATGACCGACC AAGCGACGCCTCCGTCAATA ACCACGGGAA TTTGCGGACC ACGATGCGGA CTTATTCACT ATTATTCGCC TACTTACCGT CTTTAAGCTT TACTGGCTGG TTCGCTGCGG6601CAACCTGCCA TCACGAGATT TCGATTCCAC CGCCGCCTTC TATGAAAGGT TGGGCTTCGG AATCGTTTTC CGGGACGCCG GCTGGATGAT CCTCCAGCGCGTTGGACGGT AGTGCTCTAA AGCTAAGGTG GCGGCGGAAG ATACTTTCCA ACCCGAAGCC TTAGCAAAAG GCCCTGCGGC CGACCTACTA GGAGGTCGCG6701GGGGATCTCA TGCTGGAGTT CTTCGCCCAC CCTAGGGGGA GGCTAACTGA AACACGGAAG GAGACAATAC CGGAAGGAAC CCGCGCTATG ACGGCAATAACCCCTAGAGT ACGACCTCAA GAAGCGGGTG GGATCCCCCT CCGATTGACT TTGTGCCTTC CTCTGTTATG GCCTTCCTTG GGCGCGATAC TGCCGTTATT6801AAAGACAGAA TAAAACGCAC GGTGTTGGGT 
CGTTTGTTCA TAAACGCGGG GTTCGGTCCC AGGGCTGGCA CTCTGTCGAT ACCCCACCGA GACCCCATTGTTTCTGTCTT ATTTTGCGTG CCACAACCCA GCAAACAAGT ATTTGCGCCC CAAGCCAGGG TCCCGACCGT GAGACAGCTA TGGGGTGGCT CTGGGGTAAC6901GGGCCAATAC GCCCGCGTTT CTTCCTTTTC CCCACCCCAC CCCCCAAGTT CGGGTGAAGG CCCAGGGCTC GCAGCCAACG TCGGGGCGGC AGGCCCTGCCCCCGGTTATG CGGGCGCAAA GAAGGAAAAG GGGTGGGGTG GGGGGTTCAA GCCCACTTCC GGGTCCCGAG CGTCGGTTGC AGCCCCGCCG TCCGGGACGG7001ATAGCCTCAG GTTACTCATA TATACTTTAG ATTGATTTAA AACTTCATTT TTAATTTAAA AGGATCTAGG TGAAGATCCT TTTTGATAAT CTCATGACCATATCGGAGTC CAATGAGTAT ATATGAAATC TAACTAAATT TTGAAGTAAA AATTAAATTT TCCTAGATCC ACTTCTAGGA AAAACTATTA GAGTACTGGT7101AAATCCCTTA ACGTGAGTTT TCGTTCCACT GAGCGTCAGA CCCCGTAGAA AAGATCAAAG GATCTTCTTG AGATCCTTTT TTTCTGCGCG TAATCTGCTGTTTAGGGAAT TGCACTCAAA AGCAAGGTGA CTCGCAGTCT GGGGCATCTT TTCTAGTTTC CTAGAAGAAC TCTAGGAAAA AAAGACGCGC ATTAGACGAC7201CTTGCAAACA AAAAAACCAC CGCTACCAGC GGTGGTTTGT TTGCCGGATC AAGAGCTACC AACTCTTTTT CCGAAGGTAA CTGGCTTCAG CAGAGCGCAGGAACGTTTGT TTTTTTGGTG GCGATGGTCG CCACCAAACA AACGGCCTAG TTCTCGATGG TTGAGAAAAA GGCTTCCATT GACCGAAGTC GTCTCGCGTC7301ATACCAAATA CTGTCCTTCT AGTGTAGCCG TAGTTAGGCC ACCACTTCAA GAACTCTGTA GCACCGCCTA CATACCTCGC TCTGCTAATC CTGTTACCAGTATGGTTTAT GACAGGAAGA TCACATCGGC ATCAATCCGG TGGTGAAGTT CTTGAGACAT CGTGGCGGAT GTATGGAGCG AGACGATTAG GACAATGGTC7401TGGCTGCTGC CAGTGGCGAT AAGTCGTGTC TTACCGGGTT GGACTCAAGA CGATAGTTAC CGGATAAGGC GCAGCGGTCG GGCTGAACGG GGGGTTCGTG ACCGACGACG GTCACCGCTA TTCAGCACAG AATGGCCCAA CCTGAGTTCT GCTATCAATG GCCTATTCCG CGTCGCCAGC CCGACTTGCC CCCCAAGCAC 7501CACACAGCCC AGCTTGGAGC GAACGACCTA CACCGAACTG AGATACCTAC AGCGTGAGCT ATGAGAAAGC GCCACGCTTC CCGAAGGGAG AAAGGCGGAC GTGTGTCGGG TCGAACCTCG CTTGCTGGAT GTGGCTTGAC TCTATGGATG TCGCACTCGA TACTCTTTCG CGGTGCGAAG GGCTTCCCTC TTTCCGCCTG 7601AGGTATCCGG TAAGCGGCAG GGTCGGAACA GGAGAGCGCA CGAGGGAGCT TCCAGGGGGA AACGCCTGGT ATCTTTATAG TCCTGTCGGG TTTCGCCACC TCCATAGGCC ATTCGCCGTC CCAGCCTTGT CCTCTCGCGT GCTCCCTCGA AGGTCCCCCT TTGCGGACCA TAGAAATATC AGGACAGCCC AAAGCGGTGG 7701TCTGACTTGA GCGTCGATTT 
TTGTGATGCT CGTCAGGGGG GCGGAGCCTA TGGAAAAACG CCAGCAACGC GGCCTTTTTA CGGTTCCTGG CCTTTTGCTG AGACTGAACT CGCAGCTAAA AACACTACGA GCAGTCCCCC CGCCTCGGAT ACCTTTTTGC GGTCGTTGCG CCGGAAAAAT GCCAAGGACC GGAAAACGAC 7801GCCTTTTGCT CACATGTTCT TTCCTGCGTT ATCCCCTGAT TCTGTGGATA ACCGTATTAC CGCCATGCAT TAGTTATTAA TAGTAATCAA TTACGGGGTC CGGAAAACGA GTGTACAAGA AAGGACGCAA TAGGGGACTA AGACACCTAT TGGCATAATG GCGGTACGTA ATCAATAATT ATCATTAGTT AATGCCCCAG 7901ATTAGTTCAT AGCCCATATA TGGAGTTCCG CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC CCCGCCCATT GACGTCAATA TAATCAAGTA TCGGGTATAT ACCTCAAGGC GCAATGTATT GAATGCCATT TACCGGGCGG ACCGACTGGC GGGTTGCTGG GGGCGGGTAA CTGCAGTTAT 8001ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACTGCCCA CTTGGCAGTA CATCAAGTGT TACTGCATAC AAGGGTATCA TTGCGGTTAT CCCTGAAAGG TAACTGCAGT TACCCACCTC ATAAATGCCA TTTGACGGGT GAACCGTCAT GTAGTTCACA 8101ATCATATGCC AAGTACGCCC CCTATTGACG TCAATGACGG TAAATGGCCC GCCTGGCATT ATGCCCAGTA CATGACCTTA TGGGACTTTC CTACTTGGCA TAGTATACGG TTCATGCGGG GGATAACTGC AGTTACTGCC ATTTACCGGG CGGACCGTAA TACGGGTCAT GTACTGGAAT ACCCTGAAAG GATGAACCGT 8201GTACATCTAC GTATTAGTCA TCGCTATTAC CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG ACTCACGGGG ATTTCCAAGT CATGTAGATG CATAATCAGT AGCGATAATG GTACCACTAC GCCAAAACCG TCATGTAGTT ACCCGCACCT ATCGCCAAAC TGAGTGCCCC TAAAGGTTCA 8301CTCCACCCCA TTGACGTCAA TGGGAGTTTG TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC CCCATTGACG CAAATGGGCG GAGGTGGGGT AACTGCAGTT ACCCTCAAAC AAAACCGTGG TTTTAGTTGC CCTGAAAGGT TTTACAGCAT TGTTGAGGCG GGGTAACTGC GTTTACCCGC 8401 GTAGGCGTGT ACGGTGGGAG GTCTATATAA GCAGAGCT CATCCGCACA TGCCACCCTC CAGATATATT CGTCTCGA  pVHentry-MLuc7                             Esp3I                               ~~~~~~~1GGTTTAGTGA ACCGTCAGAT CCGCTAGACG TCTCATATAC CTGACTGGAA TACGACAGCT CCTGCAGCTT CTGGGCGAAG ACCACCGTGG CCCATTGCGT CCAAATCACT TGGCAGTCTA GGCGATCTGC AGAGTATATG GACTGACCTT ATGCTGTCGA GGACGTCGAA GACCCGCTTC TGGTGGCACC GGGTAACGCA 101ACTTAGCGAT AATCTGGTCC GCTTGGAAGT TAGCACGGCG AGCGCGCTCC 
AGAGCCAAGT CACGCAGCTT AACAGTACCT ACCGCAGAGC GGTGCATGAA TGAATCGCTA TTAGACCAGG CGAACCTTCA ATCGTGCCGC TCGCGCGAGG TCTCGGTTCA GTGCGTCGAA TTGTCATGGA TGGCGTCTCG CCACGTACTT 201CAGGCCGATA ACGTTGTCCT TAGCAACCTT GACATTACCC TCACCTTTAT TGGCAGGGAA GACGTGCTTC TGACCAGTAG TGCCCTCACG AGCGGTACCA GTCCGGCTAT TGCAACAGGA ATCGTTGGAA CTGTAATGGG AGTGGAAATA ACCGTCCCTT CTGCACGAAG ACTGGTCATC ACGGGAGTGC TCGCCATGGT 301GCACCACCAG CGGTGAGGTG CGGAACTTCT ACAACCTCAA AGCCCATAAC GTTGCGGATA GAACCCTTCT CAGGGTCAAT CAGAGCAGCG TAGTTTGCTG CGTGGTGGTC GCCACTCCAC GCCTTGAAGA TGTTGGAGTT TCGGGTATTG CAACGCCTAT CTTGGGAAGA GTCCCAGTTA GTCTCGTCGC ATCAAACGAC 401CGTTCGGCAT CAGTGCTGCC AGAATCGCAG AGTAGCTATC TGGGTCACAG TAGAACACAC GGTCAGCAGC CGGAACATAG TTCTTGGTCA GAGCCGCACG GCAAGCCGTA GTCACGACGG TCTTAGCGTC TCATCGATAG ACCCAGTGTC ATCTTGTGTG CCAGTCGTCG GCCTTGTATC AAGAACCAGT CTCGGCGTGC 501AGCCTTAGTC AGAGCCGCAA TAATCTCCTT ACCCAGCGCA ACTTGGTCGG TAAGTGCGGC CTTGTTCTGA GTGGTCTCAA TTACGGTAGC AGTACCTAAG TCGGAATCAG TCTCGGCGTT ATTAGAGGAA TGGGTCGCGT TGAACCAGCC ATTCACGCCG GAACAAGACT CACCAGAGTT AATGCCATCG TCATGGATTC 601CCCTCGATGT TCTCATTATA TTTGCTTTCC ACGTTACACA GACCGGCAAT CTCAGCCAGA ACCGCACCAT CCGCAGCCAT CGCCAGAGAT TCACCCAACT GGGAGCTACA AGAGTAATAT AAACGAAAGG TGCAATGTGT CTGGCCGTTA GAGTCGGTCT TGGCGTGGTA GGCGTCGGTA GCGGTCTCTA AGTGGGTTGA 701GAGAGGTATA CTCAGAGCGA ACGTCGTAGT GGTTCATCGC GTCCTCAATA TCATAAATCA GAACGTCAGC CGTCAGGAGA CCGTCAATGG TGATTACCTT CTCTCCATAT GAGTCTCGCT TGCAGCATCA CCAAGTAGCG CAGGAGTTAT AGTATTTAGT CTTGCAGTCG GCAGTCCTCT GGCAGTTACC ACTAATGGAA 801CTCGGTGTGT TTGATGTCCT TACGTTTATC GTCGAGGTTC TCGCCCGGAG CCAGATACGC TGCCTGAGTG CGACCCAGAA CAGGGAACTG AGCGGATTTA GAGCCACACA AACTACAGGA ATGCAAATAG CAGCTCCAAG AGCGGGCCTC GGTCTATGCG ACGGACTCAC GCTGGGTCTT GTCCCTTGAC TCGCCTAAAT 901CCGCTGGAGA TGGAACGTAC CATGTGGCGA GAAGTGGTCA CGGAGGTACG AGCGAACGCA GTCAGGACTT CACCGCCAAA TACCTTCAAG AACAACGCCA GGCGACCTCT ACCTTGCATG GTACACCGCT CTTCACCAGT GCCTCCATGC TCGCTTGCGT CAGTCCTGAA GTGGCGGTTT ATGGAAGTTC TTGTTGCGGT                                                    
                                                     Esp3I                                                                                                        ~~~~~1001GTTTATCTCC AGCAGCAACT ACACCTTTAC CTTGGTTAGT ACCCATTTGC TGTCCACCAG TCATGCTAGC CATATGTATA TCTCCTTCTT AAAGTCGTCT CAAATAGAGG TCGTCGTTGA TGTGGAAATG GAACCAATCA TGGGTAAACG ACAGGTGGTC AGTACGATCG GTATACATAT AGAGGAAGAA TTTCAGCAGA Esp3I ~ 1101CCAGTGCCTC CACCAAGGGC CCATCGGTCT TCCCCCTGGC GCCCTGCTCC AGGAGCACCT CCGAGAGCAC AGCGGCCCTG GGCTGCCTGG TCAAGGACTA GGTCACGGAG GTGGTTCCCG GGTAGCCAGA AGGGGGACCG CGGGACGAGG TCCTCGTGGA GGCTCTCGTG TCGCCGGGAC CCGACGGACC AGTTCCTGAT 1201CTTCCCCGAA CCGGTGACGG TGTCGTGGAA CTCAGGCGCT CTGACCAGCG GCGTGCACAC CTTCCCAGCT GTCCTACAGT CCTCAGGACT CTACTCCCTC GAAGGGGCTT GGCCACTGCC ACAGCACCTT GAGTCCGCGA GACTGGTCGC CGCACGTGTG GAAGGGTCGA CAGGATGTCA GGAGTCCTGA GATGAGGGAG 1301AGCAGCGTGG TGACCGTGCC CTCCAGCAGC TTGGGCACCC AGACCTACAT CTGCAACGTG AATCACAAGC CCAGCAACAC CAAGGTGGAC AAGAAAGTTG TCGTCGCACC ACTGGCACGG GAGGTCGTCG AACCCGTGGG TCTGGATGTA GACGTTGCAC TTAGTGTTCG GGTCGTTGTG GTTCCACCTG TTCTTTCAAC 1401AGCCCAAATC TTGTGACAAA ACTCACACAT GCCCACCGTG CCCAGCACCT GAACTCCTGG GGGGACCGTC AGTCTTCCTC TTCCCCCCMA AACCCAAGGATCGGGTTTAG AACACTGTTT TGAGTGTGTA CGGGTGGCAC GGGTCGTGGA CTTGAGGACC CCCCTGGCAG TCAGAAGGAG AAGGGGGGKT TTGGGTTCCT1501CACCCTCATG ATCTCCCGGA CCCCTGAGGT CACATGCGTG GTGGTGGACG TGAGCCACGA AGACCCTGAG GTCAAGTTCA ACTGGTACGT GGACGGCGTGGTGGGAGTAC TAGAGGGCCT GGGGACTCCA GTGTACGCAC CACCACCTGC ACTCGGTGCT TCTGGGACTC CAGTTCAAGT TGACCATGCA CCTGCCGCAC1601GAGGTGCATA ATGCCAAGAC AAAGCCGCGG GAGGAGCAGT ACAACAGCAC GTACCGTGTG GTCAGCGTCC TCACCGTCCT GCACCAGGAC TGGCTGAATGCTCCACGTAT TACGGTTCTG TTTCGGCGCC CTCCTCGTCA TGTTGTCGTG CATGGCACAC CAGTCGCAGG AGTGGCAGGA CGTGGTCCTG ACCGACTTAC1701GCAAGGAGTA CAAGTGCAAG GTCTCCAACA AAGCCCTCCC AGCCCCCATC GAGAAAACCA TCTCCAAAGC CAAAGGGCAG CCCCGAGAAC CACAGGTGTACGTTCCTCAT GTTCACGTTC CAGAGGTTGT TTCGGGAGGG TCGGGGGTAG CTCTTTTGGT AGAGGTTTCG GTTTCCCGTC GGGGCTCTTG GTGTCCACAT1801CACCCTGCCC CCATCCCGGG ATGAGCTGAC 
CAAGAACCAG GTCAGCCTGA CCTGCCTGGT CAAAGGCTTC TACCCCAGCG ACATCGCCGT GGAGTGGGAGGTGGGACGGG GGTAGGGCCC TACTCGACTG GTTCTTGGTC CAGTCGGACT GGACGGACCA GTTTCCGAAG ATGGGGTCGC TGTAGCGGCA CCTCACCCTC1901AGCAATGGGC AGCCGGAGAA CAACTACAAG ACCACGCCTC CCATGCTGGA CTCCGACGGC TCCTTCTTCC TCTACAGCAA GCTCACCGTG GACAAGAGCATCGTTACCCG TCGGCCTCTT GTTGATGTTC TGGTGCGGAG GGTACGACCT GAGGCTGCCG AGGAAGAAGG AGATGTCGTT CGAGTGGCAC CTGTTCTCGT2001GGTGGCAGCA GGGGAACGTC TTCTCATGCT CCGTGATGCA TGAGGCTCTG CACAACCACT ACACGCAGAA GAGCCTCTCC CTGTCTCCGG GTAAAGGGTACCACCGTCGT CCCCTTGCAG AAGAGTACGA GGCACTACGT ACTCCGAGAC GTGTTGGTGA TGTGCGTCTT CTCGGAGAGG GACAGAGGCC CATTTCCCAT2101CATGTCCCAT ATGCTCGACA TGGCAAGCAG CCTGAGACAG ATTCTGGACT CCCAGAAAAT GGAGTGGAGG TCCAACGCCG GGGGCAGCGG TAGGGATAAGGTACAGGGTA TACGAGCTGT ACCGTTCGTC GGACTCTGTC TAAGACCTGA GGGTCTTTTA CCTCACCTCC AGGTTGCGGC CCCCGTCGCC ATCCCTATTC2201TGGTCAGATC TTCGCGACAA TTCCAAATCA ACTGAGTTCG ATCCTAACAT TGACATTGTT GGTTTAGAAG GAAAATTTGG TATTACAAAC CTAGAAACGGACCAGTCTAG AAGCGCTGTT AAGGTTTAGT TGACTCAAGC TAGGATTGTA ACTGTAACAA CCAAATCTTC CTTTTAAACC ATAATGTTTG GATCTTTGCC2301ATTTATTCAC AATCTGGGAG ACAATGGAGG TCATGATCAA AGCAGATATT GCAGATACTG ATAGAGCCAG CAACTTTGTT GCAACTGAAA CCGATGCTAATAAATAAGTG TTAGACCCTC TGTTACCTCC AGTACTAGTT TCGTCTATAA CGTCTATGAC TATCTCGGTC GTTGAAACAA CGTTGACTTT GGCTACGATT2401CCGCGGAAAA ATGCCTGGCA AAAAACTGCC ACTGGCAGTT ATCATGGAAA TGGAAGCCAA TGCTTTCAAA GCTGGCTGCA CCAGGGGATG CCTTATCTGTGGCGCCTTTT TACGGACCGT TTTTTGACGG TGACCGTCAA TAGTACCTTT ACCTTCGGTT ACGAAAGTTT CGACCGACGT GGTCCCCTAC GGAATAGACA2501CTTTCAAAAA TTAAGTGTAC AGCCAAAATG AAGGTATACA TTCCAGGAAG GTGTCACGAT TATGGTGGTG ACAAGAAAAC TGGACAGGCA GGAATTGTTG GAAAGTTTTT AATTCACATG TCGGTTTTAC TTCCATATGT AAGGTCCTTC CACAGTGCTA ATACCACCAC TGTTCTTTTG ACCTGTCCGT CCTTAACAAC 2601GTGCAATTGT TGACATTCCC GAAATCTCTG GATTTAAGGA GATGGCACCC ATGGAACAGT TCATTGCTCA AGTTGATCGC TGCGCTTCCT GCACTACTGG CACGTTAACA ACTGTAAGGG CTTTAGAGAC CTAAATTCCT CTACCGTGGG TACCTTGTCA AGTAACGAGT TCAACTAGCG ACGCGAAGGA CGTGATGACC 2701ATGTCTCAAA GGTCTTGCCA 
ATGTTAAGTG CTCTGAACTC CTGAAGAAAT GGCTGCCTGA CAGGTGTGCA AGTTTTGCTG ACAAGATTCA AAAAGAAGTT TACAGAGTTT CCAGAACGGT TACAATTCAC GAGACTTGAG GACTTCTTTA CCGACGGACT GTCCACACGT TCAAAACGAC TGTTCTAAGT TTTTCTTCAA 2801CACAATATCA AAGGCATGGC CGGCGATCGA TGAGCGGCCG CAATTTAATT CCGGTTATTT TCCACCATAT TGCCGTCTTT TGGCAATGTG AGGGCCCGGA GTGTTATAGT TTCCGTACCG GCCGCTAGCT ACTCGCCGGC GTTAAATTAA GGCCAATAAA AGGTGGTATA ACGGCAGAAA ACCGTTACAC TCCCGGGCCT 2901AACCTGGCCC TGTCTTCTTG ACGAGCATTC CTAGGGGTCT TTCCCCTCTC GCCAAAGGAA TGCAAGGTCT GTTGAATGTC GTGAAGGAAG CAGTTCCTCT TTGGACCGGG ACAGAAGAAC TGCTCGTAAG GATCCCCAGA AAGGGGAGAG CGGTTTCCTT ACGTTCCAGA CAACTTACAG CACTTCCTTC GTCAAGGAGA 3001GGAAGCTTCT TGAAGACAAA CAACGTCTGT AGCGACCCTT TGCAGGCAGC GGAACCCCCC ACCTGGCGAC AGGTGCCTCT GCGGCCAAAA GCCACGTGTA CCTTCGAAGA ACTTCTGTTT GTTGCAGACA TCGCTGGGAA ACGTCCGTCG CCTTGGGGGG TGGACCGCTG TCCACGGAGA CGCCGGTTTT CGGTGCACAT 3101TAAGATACAC CTGCAAAGGC GGCACAACCC CAGTGCCACG TTGTGAGTTG GATAGTTGTG GAAAGAGTCA AATGGCTCAC CTCAAGCGTA TTCAACAAGG ATTCTATGTG GACGTTTCCG CCGTGTTGGG GTCACGGTGC AACACTCAAC CTATCAACAC CTTTCTCAGT TTACCGAGTG GAGTTCGCAT AAGTTGTTCC 3201GGCTGAAGGA TGCCCAGAAG GTACCCCATT GTATGGGATC TGATCTGGGG CCTCGGTGCA CATGCTTTAC ATGTGTTTAG TCGAGGTTAA AAAACGTCTA CCGACTTCCT ACGGGTCTTC CATGGGGTAA CATACCCTAG ACTAGACCCC GGAGCCACGT GTACGAAATG TACACAAATC AGCTCCAATT TTTTGCAGAT 3301GGCCCCCCGA ACCACGGGGA CGTGGTTTTC CTTTGAAAAA CACGATGATA ATATGGCCAC CACCCATACC TAGGCTTTTG CAAAGATCGA TCAAGAGACA CCGGGGGGCT TGGTGCCCCT GCACCAAAAG GAAACTTTTT GTGCTACTAT TATACCGGTG GTGGGTATGG ATCCGAAAAC GTTTCTAGCT AGTTCTCTGT 3401GGATGAGGAT CGTTTCGCAT GATTGAACAA GATGGATTGC ACGCAGGTTC TCCGGCCGCT TGGGTGGAGA GGCTATTCGG CTATGACTGG GCACAACAGA CCTACTCCTA GCAAAGCGTA CTAACTTGTT CTACCTAACG TGCGTCCAAG AGGCCGGCGA ACCCACCTCT CCGATAAGCC GATACTGACC CGTGTTGTCT 3501CAATCGGCTG CTCTGATGCC GCCGTGTTCC GGCTGTCAGC GCAGGGGCGC CCGGTTCTTT TTGTCAAGAC CGACCTGTCC GGTGCCCTGA ATGAACTGCA GTTAGCCGAC GAGACTACGG CGGCACAAGG CCGACAGTCG CGTCCCCGCG GGCCAAGAAA AACAGTTCTG GCTGGACAGG CCACGGGACT TACTTGACGT 
3601AGACGAGGCA GCGCGGCTAT CGTGGCTGGC CACGACGGGC GTTCCTTGCG CAGCTGTGCT CGACGTTGTC ACTGAAGCGG GAAGGGACTG GCTGCTATTG TCTGCTCCGT CGCGCCGATA GCACCGACCG GTGCTGCCCG CAAGGAACGC GTCGACACGA GCTGCAACAG TGACTTCGCC CTTCCCTGAC CGACGATAAC 3701GGCGAAGTGC CGGGGCAGGA TCTCCTGTCA TCTCACCTTG CTCCTGCCGA GAAAGTATCC ATCATGGCTG ATGCAATGCG GCGGCTGCAT ACGCTTGATC CCGCTTCACG GCCCCGTCCT AGAGGACAGT AGAGTGGAAC GAGGACGGCT CTTTCATAGG TAGTACCGAC TACGTTACGC CGCCGACGTA TGCGAACTAG 3801CGGCTACCTG CCCATTCGAC CACCAAGCGA AACATCGCAT CGAGCGAGCA CGTACTCGGA TGGAAGCCGG TCTTGTCGAT CAGGATGATC TGGACGAAGA GCCGATGGAC GGGTAAGCTG GTGGTTCGCT TTGTAGCGTA GCTCGCTCGT GCATGAGCCT ACCTTCGGCC AGAACAGCTA GTCCTACTAG ACCTGCTTCT 3901GCATCAGGGG CTCGCGCCAG CCGAACTGTT CGCCAGGCTC AAGGCGAGCA TGCCCGACGG CGAGGATCTC GTCGTGACCC ATGGCGATGC CTGCTTGCCG CGTAGTCCCC GAGCGCGGTC GGCTTGACAA GCGGTCCGAG TTCCGCTCGT ACGGGCTGCC GCTCCTAGAG CAGCACTGGG TACCGCTACG GACGAACGGC 4001AATATCATGG TGGAAAATGG CCGCTTTTCT GGATTCATCG ACTGTGGCCG GCTGGGTGTG GCGGACCGCT ATCAGGACAT AGCGTTGGCT ACCCGTGATA TTATAGTACC ACCTTTTACC GGCGAAAAGA CCTAAGTAGC TGACACCGGC CGACCCACAC CGCCTGGCGA TAGTCCTGTA TCGCAACCGA TGGGCACTAT 4101TTGCTGAAGA GCTTGGCGGC GAATGGGCTG ACCGCTTCCT CGTGCTTTAC GGTATCGCCG CTCCCGATTC GCAGCGCATC GCCTTCTATC GCCTTCTTGA AACGACTTCT CGAACCGCCG CTTACCCGAC TGGCGAAGGA GCACGAAATG CCATAGCGGC GAGGGCTAAG CGTCGCGTAG CGGAAGATAG CGGAAGAACT 4201CGAGTTCTTC TGAGCGGGAC TCTGGGGTTC GGGCCGCACT CGAGCATAAA CTTGTTTATT GCAGCTTATA ATGGTTACAA ATAAAGCAAT AGCATCACAA GCTCAAGAAG ACTCGCCCTG AGACCCCAAG CCCGGCGTGA GCTCGTATTT GAACAAATAA CGTCGAATAT TACCAATGTT TATTTCGTTA TCGTAGTGTT                                                                                      I-SceI                                                                               ~~~~~~~~~~~~~~~~~~~~4301ATTTCACAAA TAAAGCATTT TTTTCACTGC ATTCTAGTTG TGGTTTGTCC AAACTCATCA ATGTATCTTA AGTAGGGATA ACAGGGTAAT TTTGTTAAAT TAAAGTGTTT ATTTCGTAAA AAAAGTGACG TAAGATCAAC ACCAAACAGG TTTGAGTAGT TACATAGAAT TCATCCCTAT TGTCCCATTA AAACAATTTA 4401CAGCTCATTT 
TTTAACCAAT AGGAACGCCA TCAAAAATAA TTCGCGTCTG GCCTTCCTGT AGCCAGCTTT CATCAACATT AAATGTGAGC GAGTAACAAC GTCGAGTAAA AAATTGGTTA TCCTTGCGGT AGTTTTTATT AAGCGCAGAC CGGAAGGACA TCGGTCGAAA GTAGTTGTAA TTTACACTCG CTCATTGTTG 4501CCGTCGGATT CTCCGTGGGA ACAAACGGCG GATTGACCGT AATGGGATAG GTTACGTTGG TGTAGATGGG CGCATCGTAA CCGTGCATCT GCCAGTTTGA GGCAGCCTAA GAGGCACCCT TGTTTGCCGC CTAACTGGCA TTACCCTATC CAATGCAACC ACATCTACCC GCGTAGCATT GGCACGTAGA CGGTCAAACT 4601GGGGACGACG ACCGTATCGG CCTCAGGAAG ATCGCACTCC AGCCAGCTTT CCGGCACCGC TTCTGGTGCC GGAAACCAGG CAAAGCGCCA TTCGCCATTC CCCCTGCTGC TGGCATAGCC GGAGTCCTTC TAGCGTGAGG TCGGTCGAAA GGCCGTGGCG AAGACCACGG CCTTTGGTCC GTTTCGCGGT AAGCGGTAAG 4701AGGCTGCGCA ACTGTTGGGA AGGGCGATCG GTGCGGGCCT CTTCGCTATT ACGCCAGCTG GCGAAAGGGG GATGTGCTGC AAGGCGATTA AGTTGGGTAA TCCGACGCGT TGACAACCCT TCCCGCTAGC CACGCCCGGA GAAGCGATAA TGCGGTCGAC CGCTTTCCCC CTACACGACG TTCCGCTAAT TCAACCCATT 4801CGCCAGGGTT TTCCCAGTCA CGACGTTGTA AAACGACGGC CAGTGAATTG CAATTCGTAA TCATGGTCAT AGCTGTTTCC TGTGTGAAAT TGTTATCCGC GCGGTCCCAA AAGGGTCAGT GCTGCAACAT TTTGCTGCCG GTCACTTAAC GTTAAGCATT AGTACCAGTA TCGACAAAGG ACACACTTTA ACAATAGGCG 4901TCACAATTCC ACACAACATA CGAGCCGGAA GCATAAAGTG TAAAGCCTGG GGTGCCTAAT GAGTGAGCTA ACTCACATTA ATTGCGTTGC GCTCACTGCCAGTGTTAAGG TGTGTTGTAT GCTCGGCCTT CGTATTTCAC ATTTCGGACC CCACGGATTA CTCACTCGAT TGAGTGTAAT TAACGCAACG CGAGTGACGG      I-SceI ~~~~~~~~~~~~~~~~~~~ 5001ATTACCCTGT TATCCCTAGT GAACCATCAC CCTAATCAAG TTTTTTGGGG TCGAGGTGCC GTAAAGCACT AAATCGGAAC CCTAAAGGGA GCCCCCGATTTAATGGGACA ATAGGGATCA CTTGGTAGTG GGATTAGTTC AAAAAACCCC AGCTCCACGG CATTTCGTGA TTTAGCCTTG GGATTTCCCT CGGGGGCTAA5101TAGAGCTTGA CGGGGAAAGC CGGCGAACGT GGCGAGAAAG GAAGGGAAGA AAGCGAAAGG AGCGGGCGCT AGGGCGCTGG CAAGTGTAGC GGTCACGCTGATCTCGAACT GCCCCTTTCG GCCGCTTGCA CCGCTCTTTC CTTCCCTTCT TTCGCTTTCC TCGCCCGCGA TCCCGCGACC GTTCACATCG CCAGTGCGAC5201CGCGTAACCA CCACACCCGC CGCGCTTAAT GCGCCGCTAC AGGGCGCGTC AGGTGGCACT TTTCGGGGAA ATGTGCGCGG AACCCCTATT TGTTTATTTTGCGCATTGGT GGTGTGGGCG GCGCGAATTA CGCGGCGATG TCCCGCGCAG TCCACCGTGA AAAGCCCCTT 
TACACGCGCC TTGGGGATAA ACAAATAAAA5301TCTAAATACA TTCAAATATG TATCCGCTCA TGAGACAATA ACCCTGATAA ATGCTTCAAT AATAACGACC GGTAATGAAA AAGGAAGAGT ATGAGTATTCAGATTTATGT AAGTTTATAC ATAGGCGAGT ACTCTGTTAT TGGGACTATT TACGAAGTTA TTATTGCTGG CCATTACTTT TTCCTTCTCA TACTCATAAG5401AACATTTCCG TGTCGCCCTT ATTCCCTTTT TTGCGGCATT TTGCCTTCCT GTTTTTGCTC ACCCAGAAAC GCTGGTGAAA GTAAAAGATG CTGAAGATCATTGTAAAGGC ACAGCGGGAA TAAGGGAAAA AACGCCGTAA AACGGAAGGA CAAAAACGAG TGGGTCTTTG CGACCACTTT CATTTTCTAC GACTTCTAGT5501GTTGGGTGCA CGAGTGGGTT ACATCGAACT GGATCTCAAC AGCGGTAAGA TCCTTGAGAG TTTTCGCCCC GAAGAACGTT TTCCAATGAT GAGCACTTTTCAACCCACGT GCTCACCCAA TGTAGCTTGA CCTAGAGTTG TCGCCATTCT AGGAACTCTC AAAAGCGGGG CTTCTTGCAA AAGGTTACTA CTCGTGAAAA5601AAAGTTCTGC TATGTGGCGC GGTATTATCC CGTATTGACG CCGGGCAAGA GCAACTCGGT CGCCGCATAC ACTATTCTCA GAATGACTTG GTTGAGTCTATTTCAAGACG ATACACCGCG CCATAATAGG GCATAACTGC GGCCCGTTCT CGTTGAGCCA GCGGCGTATG TGATAAGAGT CTTACTGAAC CAACTCAGAT5701GCGTTGATCG GCACGTAAGA GGTTCCAACT TTCACCATAA TGAAATAAGA TCACTACCGG GCGTATTTTT TGAGTTATCG AGATTTTCAG GAGCTAAGGACGCAACTAGC CGTGCATTCT CCAAGGTTGA AAGTGGTATT ACTTTATTCT AGTGATGGCC CGCATAAAAA ACTCAATAGC TCTAAAAGTC CTCGATTCCT5801AGCTAAAATG GAGAAAAAAA TCACTGGATA TACCACCGTT GATATATCCC AATGGCATCG TAAAGAACAT TTTGAGGCAT TTCAGTCAGT TGCTCAATGTTCGATTTTAC CTCTTTTTTT AGTGACCTAT ATGGTGGCAA CTATATAGGG TTACCGTAGC ATTTCTTGTA AAACTCCGTA AAGTCAGTCA ACGAGTTACA5901ACCTATAACC AGACCGTTCA GCTGGATATT ACGGCCTTTT TAAAGACCGT AAAGAAAAAT AAGCACAAGT TTTATCCGGC CTTTATTCAC ATTCTTGCCCTGGATATTGG TCTGGCAAGT CGACCTATAA TGCCGGAAAA ATTTCTGGCA TTTCTTTTTA TTCGTGTTCA AAATAGGCCG GAAATAAGTG TAAGAACGGG6001GCCTGATGAA TGCTCATCCG GAATTCCGTA TGGCAATGAA AGACGGTGAG CTGGTGATAT GGGATAGTGT TCACCCTTGT TACACCGTTT TCCATGAGCACGGACTACTT ACGAGTAGGC CTTAAGGCAT ACCGTTACTT TCTGCCACTC GACCACTATA CCCTATCACA AGTGGGAACA ATGTGGCAAA AGGTACTCGT6101AACTGAAACG TTTTCATCGC TCTGGAGTGA ATACCACGAC GATTTCCGGC AGTTTCTACA CATATATTCG CAAGATGTGG CGTGTTACGG TGAAAACCTGTTGACTTTGC AAAAGTAGCG AGACCTCACT TATGGTGCTG CTAAAGGCCG TCAAAGATGT GTATATAAGC 
GTTCTACACC GCACAATGCC ACTTTTGGAC6201GCCTATTTCC CTAAAGGGTT TATTGAGAAT ATGTTTTTCG TATCAGCCAA TCCCTGGGTG AGTTTCACCA GTTTTGATTT AAACGTGGCC AATATGGACACGGATAAAGG GATTTCCCAA ATAACTCTTA TACAAAAAGC ATAGTCGGTT AGGGACCCAC TCAAAGTGGT CAAAACTAAA TTTGCACCGG TTATACCTGT6301ACTTCTTCGC CCCCGTTTTC ACCATGGGCA AATATTATAC GCAAGGCGAC AAGGTGCTGA TGCCGCTGGC GATTCAGGTT CATCATGCCG TCTGTGATGGTGAAGAAGCG GGGGCAAAAG TGGTACCCGT TTATAATATG CGTTCCGCTG TTCCACGACT ACGGCGACCG CTAAGTCCAA GTAGTACGGC AGACACTACC 6401CTTCCATGTC GGCAGAATGC TTAATGAATT ACAACAGTAC TGCGATGAGT GGCAGGGCGG GGCGTAATTT TTTTAAGGCA GTTATTGGTG CCCTTAAACG GAAGGTACAG CCGTCTTACG AATTACTTAA TGTTGTCATG ACGCTACTCA CCGTCCCGCC CCGCATTAAA AAAATTCCGT CAATAACCAC GGGAATTTGC 6501CCTGGTGCTA CGCCTGAATA AGTGATAATA AGCGGATGAA TGGCAGAAAT TCGAAATGAC CGACCAAGCG ACGCCCAACC TGCCATCACG AGATTTCGAT GGACCACGAT GCGGACTTAT TCACTATTAT TCGCCTACTT ACCGTCTTTA AGCTTTACTG GCTGGTTCGC TGCGGGTTGG ACGGTAGTGC TCTAAAGCTA 6601TCCACCGCCG CCTTCTATGA AAGGTTGGGC TTCGGAATCG TTTTCCGGGA CGCCGGCTGG ATGATCCTCC AGCGCGGGGA TCTCATGCTG GAGTTCTTCG AGGTGGCGGC GGAAGATACT TTCCAACCCG AAGCCTTAGC AAAAGGCCCT GCGGCCGACC TACTAGGAGG TCGCGCCCCT AGAGTACGAC CTCAAGAAGC 6701CCCACCCTAG GGGGAGGCTA ACTGAAACAC GGAAGGAGAC AATACCGGAA GGAACCCGCG CTATGACGGC AATAAAAAGA CAGAATAAAA CGCACGGTGT GGGTGGGATC CCCCTCCGAT TGACTTTGTG CCTTCCTCTG TTATGGCCTT CCTTGGGCGC GATACTGCCG TTATTTTTCT GTCTTATTTT GCGTGCCACA 6801TGGGTCGTTT GTTCATAAAC GCGGGGTTCG GTCCCAGGGC TGGCACTCTG TCGATACCCC ACCGAGACCC CATTGGGGCC AATACGCCCG CGTTTCTTCC ACCCAGCAAA CAAGTATTTG CGCCCCAAGC CAGGGTCCCG ACCGTGAGAC AGCTATGGGG TGGCTCTGGG GTAACCCCGG TTATGCGGGC GCAAAGAAGG 6901TTTTCCCCAC CCCACCCCCC AAGTTCGGGT GAAGGCCCAG GGCTCGCAGC CAACGTCGGG GCGGCAGGCC CTGCCATAGC CTCAGGTTAC TCATATATAC AAAAGGGGTG GGGTGGGGGG TTCAAGCCCA CTTCCGGGTC CCGAGCGTCG GTTGCAGCCC CGCCGTCCGG GACGGTATCG GAGTCCAATG AGTATATATG 7001TTTAGATTGA TTTAAAACTT CATTTTTAAT TTAAAAGGAT CTAGGTGAAG ATCCTTTTTG ATAATCTCAT GACCAAAATC CCTTAACGTG AGTTTTCGTT AAATCTAACT AAATTTTGAA GTAAAAATTA AATTTTCCTA GATCCACTTC 
TAGGAAAAAC TATTAGAGTA CTGGTTTTAG GGAATTGCAC TCAAAAGCAA 7101CCACTGAGCG TCAGACCCCG TAGAAAAGAT CAAAGGATCT TCTTGAGATC CTTTTTTTCT GCGCGTAATC TGCTGCTTGC AAACAAAAAA ACCACCGCTA GGTGACTCGC AGTCTGGGGC ATCTTTTCTA GTTTCCTAGA AGAACTCTAG GAAAAAAAGA CGCGCATTAG ACGACGAACG TTTGTTTTTT TGGTGGCGAT 7201CCAGCGGTGG TTTGTTTGCC GGATCAAGAG CTACCAACTC TTTTTCCGAA GGTAACTGGC TTCAGCAGAG CGCAGATACC AAATACTGTC CTTCTAGTGT GGTCGCCACC AAACAAACGG CCTAGTTCTC GATGGTTGAG AAAAAGGCTT CCATTGACCG AAGTCGTCTC GCGTCTATGG TTTATGACAG GAAGATCACA 7301AGCCGTAGTT AGGCCACCAC TTCAAGAACT CTGTAGCACC GCCTACATAC CTCGCTCTGC TAATCCTGTT ACCAGTGGCT GCTGCCAGTG GCGATAAGTC TCGGCATCAA TCCGGTGGTG AAGTTCTTGA GACATCGTGG CGGATGTATG GAGCGAGACG ATTAGGACAA TGGTCACCGA CGACGGTCAC CGCTATTCAG 7401GTGTCTTACC GGGTTGGACT CAAGACGATA GTTACCGGAT AAGGCGCAGC GGTCGGGCTG AACGGGGGGT TCGTGCACAC AGCCCAGCTT GGAGCGAACG CACAGAATGG CCCAACCTGA GTTCTGCTAT CAATGGCCTA TTCCGCGTCG CCAGCCCGAC TTGCCCCCCA AGCACGTGTG TCGGGTCGAA CCTCGCTTGC 7501ACCTACACCG AACTGAGATA CCTACAGCGT GAGCTATGAG AAAGCGCCAC GCTTCCCGAA GGGAGAAAGG CGGACAGGTA TCCGGTAAGC GGCAGGGTCG TGGATGTGGC TTGACTCTAT GGATGTCGCA CTCGATACTC TTTCGCGGTG CGAAGGGCTT CCCTCTTTCC GCCTGTCCAT AGGCCATTCG CCGTCCCAGC 7601GAACAGGAGA GCGCACGAGG GAGCTTCCAG GGGGAAACGC CTGGTATCTT TATAGTCCTG TCGGGTTTCG CCACCTCTGA CTTGAGCGTC GATTTTTGTG CTTGTCCTCT CGCGTGCTCC CTCGAAGGTC CCCCTTTGCG GACCATAGAA ATATCAGGAC AGCCCAAAGC GGTGGAGACT GAACTCGCAG CTAAAAACAC 7701ATGCTCGTCA GGGGGGCGGA GCCTATGGAA AAACGCCAGC AACGCGGCCT TTTTACGGTT CCTGGCCTTT TGCTGGCCTT TTGCTCACAT GTTCTTTCCT TACGAGCAGT CCCCCCGCCT CGGATACCTT TTTGCGGTCG TTGCGCCGGA AAAATGCCAA GGACCGGAAA ACGACCGGAA AACGAGTGTA CAAGAAAGGA 7801GCGTTATCCC CTGATTCTGT GGATAACCGT ATTACCGCCA TGCATTAGTT ATTAATAGTA ATCAATTACG GGGTCATTAG TTCATAGCCC ATATATGGAG CGCAATAGGG GACTAAGACA CCTATTGGCA TAATGGCGGT ACGTAATCAA TAATTATCAT TAGTTAATGC CCCAGTAATC AAGTATCGGG TATATACCTC 7901TTCCGCGTTA CATAACTTAC GGTAAATGGC CCGCCTGGCT GACCGCCCAA CGACCCCCGC CCATTGACGT CAATAATGAC GTATGTTCCC ATAGTAACGC AAGGCGCAAT GTATTGAATG CCATTTACCG 
GGCGGACCGA CTGGCGGGTT GCTGGGGGCG GGTAACTGCA GTTATTACTG CATACAAGGG TATCATTGCG 8001CAATAGGGAC TTTCCATTGA CGTCAATGGG TGGAGTATTT ACGGTAAACT GCCCACTTGG CAGTACATCA AGTGTATCAT ATGCCAAGTA CGCCCCCTAT GTTATCCCTG AAAGGTAACT GCAGTTACCC ACCTCATAAA TGCCATTTGA CGGGTGAACC GTCATGTAGT TCACATAGTA TACGGTTCAT GCGGGGGATA 8101TGACGTCAAT GACGGTAAAT GGCCCGCCTG GCATTATGCC CAGTACATGA CCTTATGGGA CTTTCCTACT TGGCAGTACA TCTACGTATT AGTCATCGCT ACTGCAGTTA CTGCCATTTA CCGGGCGGAC CGTAATACGG GTCATGTACT GGAATACCCT GAAAGGATGA ACCGTCATGT AGATGCATAA TCAGTAGCGA 8201ATTACCATGG TGATGCGGTT TTGGCAGTAC ATCAATGGGC GTGGATAGCG GTTTGACTCA CGGGGATTTC CAAGTCTCCA CCCCATTGAC GTCAATGGGA TAATGGTACC ACTACGCCAA AACCGTCATG TAGTTACCCG CACCTATCGC CAAACTGAGT GCCCCTAAAG GTTCAGAGGT GGGGTAACTG CAGTTACCCT 8301GTTTGTTTTG GCACCAAAAT CAACGGGACT TTCCAAAATG TCGTAACAAC TCCGCCCCAT TGACGCAAAT GGGCGGTAGG CGTGTACGGT GGGAGGTCTA CAAACAAAAC CGTGGTTTTA GTTGCCCTGA AAGGTTTTAC AGCATTGTTG AGGCGGGGTA ACTGCGTTTA CCCGCCATCC GCACATGCCA CCCTCCAGAT 8401 TATAAGCAGA GCT  ATATTCGTCT CGA  pVHentry-Hisbio1                         Esp3I                          ~~~~~~~ 1GGTTTAGTGA ACCGTCAGAT CCGCTAGACG TCTCATATAC CTGACTGGAA TACGACAGCT CCTGCAGCTT CTGGGCGAAG ACCACCGTGG CCCATTGCGT CCAAATCACT TGGCAGTCTA GGCGATCTGC AGAGTATATG GACTGACCTT ATGCTGTCGA GGACGTCGAA GACCCGCTTC TGGTGGCACC GGGTAACGCA 101ACTTAGCGAT AATCTGGTCC GCTTGGAAGT TAGCACGGCG AGCGCGCTCC AGAGCCAAGT CACGCAGCTT AACAGTACCT ACCGCAGAGC GGTGCATGAA TGAATCGCTA TTAGACCAGG CGAACCTTCA ATCGTGCCGC TCGCGCGAGG TCTCGGTTCA GTGCGTCGAA TTGTCATGGA TGGCGTCTCG CCACGTACTT 201CAGGCCGATA ACGTTGTCCT TAGCAACCTT GACATTACCC TCACCTTTAT TGGCAGGGAA GACGTGCTTC TGACCAGTAG TGCCCTCACG AGCGGTACCA GTCCGGCTAT TGCAACAGGA ATCGTTGGAA CTGTAATGGG AGTGGAAATA ACCGTCCCTT CTGCACGAAG ACTGGTCATC ACGGGAGTGC TCGCCATGGT 301GCACCACCAG CGGTGAGGTG CGGAACTTCT ACAACCTCAA AGCCCATAAC GTTGCGGATA GAACCCTTCT CAGGGTCAAT CAGAGCAGCG TAGTTTGCTG CGTGGTGGTC GCCACTCCAC GCCTTGAAGA TGTTGGAGTT TCGGGTATTG CAACGCCTAT CTTGGGAAGA GTCCCAGTTA GTCTCGTCGC ATCAAACGAC401CGTTCGGCAT 
CAGTGCTGCC AGAATCGCAG AGTAGCTATC TGGGTCACAG TAGAACACAC GGTCAGCAGC CGGAACATAG TTCTTGGTCA GAGCCGCACGGCAAGCCGTA GTCACGACGG TCTTAGCGTC TCATCGATAG ACCCAGTGTC ATCTTGTGTG CCAGTCGTCG GCCTTGTATC AAGAACCAGT CTCGGCGTGC501AGCCTTAGTC AGAGCCGCAA TAATCTCCTT ACCCAGCGCA ACTTGGTCGG TAAGTGCGGC CTTGTTCTGA GTGGTCTCAA TTACGGTAGC AGTACCTAAGTCGGAATCAG TCTCGGCGTT ATTAGAGGAA TGGGTCGCGT TGAACCAGCC ATTCACGCCG GAACAAGACT CACCAGAGTT AATGCCATCG TCATGGATTC601CCCTCGATGT TCTCATTATA TTTGCTTTCC ACGTTACACA GACCGGCAAT CTCAGCCAGA ACCGCACCAT CCGCAGCCAT CGCCAGAGAT TCACCCAACTGGGAGCTACA AGAGTAATAT AAACGAAAGG TGCAATGTGT CTGGCCGTTA GAGTCGGTCT TGGCGTGGTA GGCGTCGGTA GCGGTCTCTA AGTGGGTTGA701GAGAGGTATA CTCAGAGCGA ACGTCGTAGT GGTTCATCGC GTCCTCAATA TCATAAATCA GAACGTCAGC CGTCAGGAGA CCGTCAATGG TGATTACCTTCTCTCCATAT GAGTCTCGCT TGCAGCATCA CCAAGTAGCG CAGGAGTTAT AGTATTTAGT CTTGCAGTCG GCAGTCCTCT GGCAGTTACC ACTAATGGAA801CTCGGTGTGT TTGATGTCCT TACGTTTATC GTCGAGGTTC TCGCCCGGAG CCAGATACGC TGCCTGAGTG CGACCCAGAA CAGGGAACTG AGCGGATTTAGAGCCACACA AACTACAGGA ATGCAAATAG CAGCTCCAAG AGCGGGCCTC GGTCTATGCG ACGGACTCAC GCTGGGTCTT GTCCCTTGAC TCGCCTAAAT901CCGCTGGAGA TGGAACGTAC CATGTGGCGA GAAGTGGTCA CGGAGGTACG AGCGAACGCA GTCAGGACTT CACCGCCAAA TACCTTCAAG AACAACGCCAGGCGACCTCT ACCTTGCATG GTACACCGCT CTTCACCAGT GCCTCCATGC TCGCTTGCGT CAGTCCTGAA GTGGCGGTTT ATGGAAGTTC TTGTTGCGGT                                                                                                        Esp3I                                                                                                        ~~~~~1001GTTTATCTCC AGCAGCAACT ACACCTTTAC CTTGGTTAGT ACCCATTTGC TGTCCACCAG TCATGCTAGC CATATGTATA TCTCCTTCTT AAAGTCGTCTCAAATAGAGG TCGTCGTTGA TGTGGAAATG GAACCAATCA TGGGTAAACG ACAGGTGGTC AGTACGATCG GTATACATAT AGAGGAAGAA TTTCAGCAGAEsp3I ~ 1101CCAGTGCCTC CACCAAGGGC CCATCGGTCT TCCCCCTGGC GCCCTGCTCC AGGAGCACCT CCGAGAGCAC AGCGGCCCTG GGCTGCCTGG TCAAGGACTAGGTCACGGAG GTGGTTCCCG GGTAGCCAGA AGGGGGACCG CGGGACGAGG TCCTCGTGGA GGCTCTCGTG TCGCCGGGAC CCGACGGACC AGTTCCTGAT1201CTTCCCCGAA 
CCGGTGACGG TGTCGTGGAA CTCAGGCGCT CTGACCAGCG GCGTGCACAC CTTCCCAGCT GTCCTACAGT CCTCAGGACT CTACTCCCTCGAAGGGGCTT GGCCACTGCC ACAGCACCTT GAGTCCGCGA GACTGGTCGC CGCACGTGTG GAAGGGTCGA CAGGATGTCA GGAGTCCTGA GATGAGGGAG1301AGCAGCGTGG TGACCGTGCC CTCCAGCAGC TTGGGCACCC AGACCTACAT CTGCAACGTG AATCACAAGC CCAGCAACAC CAAGGTGGAC AAGAAAGTTGTCGTCGCACC ACTGGCACGG GAGGTCGTCG AACCCGTGGG TCTGGATGTA GACGTTGCAC TTAGTGTTCG GGTCGTTGTG GTTCCACCTG TTCTTTCAAC1401AGCCCAAATC TTGTGACAAA ACTCACACAT GCCCACCGTG CCCAGCACCT GAACTCCTGG GGGGACCGTC AGTCTTCCTC TTCCCCCCMA AACCCAAGGATCGGGTTTAG AACACTGTTT TGAGTGTGTA CGGGTGGCAC GGGTCGTGGA CTTGAGGACC CCCCTGGCAG TCAGAAGGAG AAGGGGGGKT TTGGGTTCCT 1501CACCCTCATG ATCTCCCGGA CCCCTGAGGT CACATGCGTG GTGGTGGACG TGAGCCACGA AGACCCTGAG GTCAAGTTCA ACTGGTACGT GGACGGCGTG GTGGGAGTAC TAGAGGGCCT GGGGACTCCA GTGTACGCAC CACCACCTGC ACTCGGTGCT TCTGGGACTC CAGTTCAAGT TGACCATGCA CCTGCCGCAC 1601GAGGTGCATA ATGCCAAGAC AAAGCCGCGG GAGGAGCAGT ACAACAGCAC GTACCGTGTG GTCAGCGTCC TCACCGTCCT GCACCAGGAC TGGCTGAATG CTCCACGTAT TACGGTTCTG TTTCGGCGCC CTCCTCGTCA TGTTGTCGTG CATGGCACAC CAGTCGCAGG AGTGGCAGGA CGTGGTCCTG ACCGACTTAC 1701GCAAGGAGTA CAAGTGCAAG GTCTCCAACA AAGCCCTCCC AGCCCCCATC GAGAAAACCA TCTCCAAAGC CAAAGGGCAG CCCCGAGAAC CACAGGTGTA CGTTCCTCAT GTTCACGTTC CAGAGGTTGT TTCGGGAGGG TCGGGGGTAG CTCTTTTGGT AGAGGTTTCG GTTTCCCGTC GGGGCTCTTG GTGTCCACAT 1801CACCCTGCCC CCATCCCGGG ATGAGCTGAC CAAGAACCAG GTCAGCCTGA CCTGCCTGGT CAAAGGCTTC TACCCCAGCG ACATCGCCGT GGAGTGGGAG GTGGGACGGG GGTAGGGCCC TACTCGACTG GTTCTTGGTC CAGTCGGACT GGACGGACCA GTTTCCGAAG ATGGGGTCGC TGTAGCGGCA CCTCACCCTC 1901AGCAATGGGC AGCCGGAGAA CAACTACAAG ACCACGCCTC CCATGCTGGA CTCCGACGGC TCCTTCTTCC TCTACAGCAA GCTCACCGTG GACAAGAGCA TCGTTACCCG TCGGCCTCTT GTTGATGTTC TGGTGCGGAG GGTACGACCT GAGGCTGCCG AGGAAGAAGG AGATGTCGTT CGAGTGGCAC CTGTTCTCGT 2001GGTGGCAGCA GGGGAACGTC TTCTCATGCT CCGTGATGCA TGAGGCTCTG CACAACCACT ACACGCAGAA GAGCCTCTCC CTGTCTCCGG GTAAAGGGTA CCACCGTCGT CCCCTTGCAG AAGAGTACGA GGCACTACGT ACTCCGAGAC GTGTTGGTGA TGTGCGTCTT CTCGGAGAGG GACAGAGGCC CATTTCCCAT 
2101CATGTCCCAT ATGCTCGACA TGGCAAGCAG CCTGAGACAG ATTCTGGACT CCCAGAAAAT GGAGTGGAGG TCCAACGCCG GGGGCAGCGG TAGGGATAAG GTACAGGGTA TACGAGCTGT ACCGTTCGTC GGACTCTGTC TAAGACCTGA GGGTCTTTTA CCTCACCTCC AGGTTGCGGC CCCCGTCGCC ATCCCTATTC 2201TGGTCAGATC TTCGCATGGG CAGCAGCCAT CATCATCATC ATCACAGCAG CGGCATGGCA AGCAGCCTGA GACAGATTCT GGACTCCCAG AAAATGGAGT ACCAGTCTAG AAGCGTACCC GTCGTCGGTA GTAGTAGTAG TAGTGTCGTC GCCGTACCGT TCGTCGGACT CTGTCTAAGA CCTGAGGGTC TTTTACCTCA                                  I-SceI                           ~~~~~~~~~~~~~~~~~~~~ 2301GGAGGTCCAA CGCCGGGGGC AGCGGTAGGG ATAACAGGGT AATCCATATG CTCGAGGGGG CCAAGGCCGC GCCGGCCTGC AGGCATGCAA GCTTGGCGTA CCTCCAGGTT GCGGCCCCCG TCGCCATCCC TATTGTCCCA TTAGGTATAC GAGCTCCCCC GGTTCCGGCG CGGCCGGACG TCCGTACGTT CGAACCGCAT 2401ATCATGGTCA TAGCTGTTTC CTGTGTGAAA TTGTTATCCG CTCACAATTC CACACAACAT ACGAGCCGGA AGCATAAAGT GTAAAGCCTG GGGTGCCTAA TAGTACCAGT ATCGACAAAG GACACACTTT AACAATAGGC GAGTGTTAAG GTGTGTTGTA TGCTCGGCCT TCGTATTTCA CATTTCGGAC CCCACGGATT 2501TGAGTGAGCT AACTCACATT AATTGCGTTG CGCTCACTGC CCGCTTTCCA GTCGGGAAAC CTGTCGTGCC AGCGAGCTCG AATTGTTGAC ATTCCCGAAA ACTCACTCGA TTGAGTGTAA TTAACGCAAC GCGAGTGACG GGCGAAAGGT CAGCCCTTTG GACAGCACGG TCGCTCGAGC TTAACAACTG TAAGGGCTTT 2601TCTCTGGATT TAAGGAGATG GCACCCATGG AACAGTTCAT TGCTCAAGTT GATCGCTGCG CTTCCTGCAC TACTGGATGT CTCAAAGGTC TTGCCAATGT AGAGACCTAA ATTCCTCTAC CGTGGGTACC TTGTCAAGTA ACGAGTTCAA CTAGCGACGC GAAGGACGTG ATGACCTACA GAGTTTCCAG AACGGTTACA 2701TAAGTGCTCT GAACTCCTGA AGAAATGGCT GCCTGACAGG TGTGCAAGTT TTGCTGACAA GATTCAAAAA GAAGTTCACA ATATCAAAGG CATGGCCGGC ATTCACGAGA CTTGAGGACT TCTTTACCGA CGGACTGTCC ACACGTTCAA AACGACTGTT CTAAGTTTTT CTTCAAGTGT TATAGTTTCC GTACCGGCCG 2801GATCGATGAG CGGCCGCAAT TTAATTCCGG TTATTTTCCA CCATATTGCC GTCTTTTGGC AATGTGAGGG CCCGGAAACC TGGCCCTGTC TTCTTGACGA CTAGCTACTC GCCGGCGTTA AATTAAGGCC AATAAAAGGT GGTATAACGG CAGAAAACCG TTACACTCCC GGGCCTTTGG ACCGGGACAG AAGAACTGCT 2901GCATTCCTAG GGGTCTTTCC CCTCTCGCCA AAGGAATGCA AGGTCTGTTG AATGTCGTGA AGGAAGCAGT TCCTCTGGAA GCTTCTTGAA GACAAACAAC 
CGTAAGGATC CCCAGAAAGG GGAGAGCGGT TTCCTTACGT TCCAGACAAC TTACAGCACT TCCTTCGTCA AGGAGACCTT CGAAGAACTT CTGTTTGTTG 3001GTCTGTAGCG ACCCTTTGCA GGCAGCGGAA CCCCCCACCT GGCGACAGGT GCCTCTGCGG CCAAAAGCCA CGTGTATAAG ATACACCTGC AAAGGCGGCA CAGACATCGC TGGGAAACGT CCGTCGCCTT GGGGGGTGGA CCGCTGTCCA CGGAGACGCC GGTTTTCGGT GCACATATTC TATGTGGACG TTTCCGCCGT 3101CAACCCCAGT GCCACGTTGT GAGTTGGATA GTTGTGGAAA GAGTCAAATG GCTCACCTCA AGCGTATTCA ACAAGGGGCT GAAGGATGCC CAGAAGGTAC GTTGGGGTCA CGGTGCAACA CTCAACCTAT CAACACCTTT CTCAGTTTAC CGAGTGGAGT TCGCATAAGT TGTTCCCCGA CTTCCTACGG GTCTTCCATG 3201CCCATTGTAT GGGATCTGAT CTGGGGCCTC GGTGCACATG CTTTACATGT GTTTAGTCGA GGTTAAAAAA CGTCTAGGCC CCCCGAACCA CGGGGACGTG GGGTAACATA CCCTAGACTA GACCCCGGAG CCACGTGTAC GAAATGTACA CAAATCAGCT CCAATTTTTT GCAGATCCGG GGGGCTTGGT GCCCCTGCAC 3301GTTTTCCTTT GAAAAACACG ATGATAATAT GGCCACCACC CATACCTAGG CTTTTGCAAA GATCGATCAA GAGACAGGAT GAGGATCGTT TCGCATGATT CAAAAGGAAA CTTTTTGTGC TACTATTATA CCGGTGGTGG GTATGGATCC GAAAACGTTT CTAGCTAGTT CTCTGTCCTA CTCCTAGCAA AGCGTACTAA 3401GAACAAGATG GATTGCACGC AGGTTCTCCG GCCGCTTGGG TGGAGAGGCT ATTCGGCTAT GACTGGGCAC AACAGACAAT CGGCTGCTCT GATGCCGCCG CTTGTTCTAC CTAACGTGCG TCCAAGAGGC CGGCGAACCC ACCTCTCCGA TAAGCCGATA CTGACCCGTG TTGTCTGTTA GCCGACGAGA CTACGGCGGC 3501TGTTCCGGCT GTCAGCGCAG GGGCGCCCGG TTCTTTTTGT CAAGACCGAC CTGTCCGGTG CCCTGAATGA ACTGCAAGAC GAGGCAGCGC GGCTATCGTG ACAAGGCCGA CAGTCGCGTC CCCGCGGGCC AAGAAAAACA GTTCTGGCTG GACAGGCCAC GGGACTTACT TGACGTTCTG CTCCGTCGCG CCGATAGCAC 3601GCTGGCCACG ACGGGCGTTC CTTGCGCAGC TGTGCTCGAC GTTGTCACTG AAGCGGGAAG GGACTGGCTG CTATTGGGCG AAGTGCCGGG GCAGGATCTC CGACCGGTGC TGCCCGCAAG GAACGCGTCG ACACGAGCTG CAACAGTGAC TTCGCCCTTC CCTGACCGAC GATAACCCGC TTCACGGCCC CGTCCTAGAG 3701CTGTCATCTC ACCTTGCTCC TGCCGAGAAA GTATCCATCA TGGCTGATGC AATGCGGCGG CTGCATACGC TTGATCCGGC TACCTGCCCA TTCGACCACC GACAGTAGAG TGGAACGAGG ACGGCTCTTT CATAGGTAGT ACCGACTACG TTACGCCGCC GACGTATGCG AACTAGGCCG ATGGACGGGT AAGCTGGTGG 3801AAGCGAAACA TCGCATCGAG CGAGCACGTA CTCGGATGGA AGCCGGTCTT GTCGATCAGG ATGATCTGGA CGAAGAGCAT 
CAGGGGCTCG CGCCAGCCGATTCGCTTTGT AGCGTAGCTC GCTCGTGCAT GAGCCTACCT TCGGCCAGAA CAGCTAGTCC TACTAGACCT GCTTCTCGTA GTCCCCGAGC GCGGTCGGCT 3901ACTGTTCGCC AGGCTCAAGG CGAGCATGCC CGACGGCGAG GATCTCGTCG TGACCCATGG CGATGCCTGC TTGCCGAATA TCATGGTGGA AAATGGCCGCTGACAAGCGG TCCGAGTTCC GCTCGTACGG GCTGCCGCTC CTAGAGCAGC ACTGGGTACC GCTACGGACG AACGGCTTAT AGTACCACCT TTTACCGGCG4001TTTTCTGGAT TCATCGACTG TGGCCGGCTG GGTGTGGCGG ACCGCTATCA GGACATAGCG TTGGCTACCC GTGATATTGC TGAAGAGCTT GGCGGCGAATAAAAGACCTA AGTAGCTGAC ACCGGCCGAC CCACACCGCC TGGCGATAGT CCTGTATCGC AACCGATGGG CACTATAACG ACTTCTCGAA CCGCCGCTTA4101GGGCTGACCG CTTCCTCGTG CTTTACGGTA TCGCCGCTCC CGATTCGCAG CGCATCGCCT TCTATCGCCT TCTTGACGAG TTCTTCTGAG CGGGACTCTGCCCGACTGGC GAAGGAGCAC GAAATGCCAT AGCGGCGAGG GCTAAGCGTC GCGTAGCGGA AGATAGCGGA AGAACTGCTC AAGAAGACTC GCCCTGAGAC4201GGGTTCGGGC CGCACTCGAG CATAAACTTG TTTATTGCAG CTTATAATGG TTACAAATAA AGCAATAGCA TCACAAATTT CACAAATAAA GCATTTTTTTCCCAAGCCCG GCGTGAGCTC GTATTTGAAC AAATAACGTC GAATATTACC AATGTTTATT TCGTTATCGT AGTGTTTAAA GTGTTTATTT CGTAAAAAAA                                                          I-SceI                                                    ~~~~~~~~~~~~~~ 4301CACTGCATTC TAGTTGTGGT TTGTCCAAAC TCATCAATGT ATCTTAAGTA GGGATAACAG GGTAATTTTG TTAAATCAGC TCATTTTTTA ACCAATAGGAGTGACGTAAG ATCAACACCA AACAGGTTTG AGTAGTTACA TAGAATTCAT CCCTATTGTC CCATTAAAAC AATTTAGTCG AGTAAAAAAT TGGTTATCCT4401ACGCCATCAA AAATAATTCG CGTCTGGCCT TCCTGTAGCC AGCTTTCATC AACATTAAAT GTGAGCGAGT AACAACCCGT CGGATTCTCC GTGGGAACAATGCGGTAGTT TTTATTAAGC GCAGACCGGA AGGACATCGG TCGAAAGTAG TTGTAATTTA CACTCGCTCA TTGTTGGGCA GCCTAAGAGG CACCCTTGTT4501ACGGCGGATT GACCGTAATG GGATAGGTTA CGTTGGTGTA GATGGGCGCA TCGTAACCGT GCATCTGCCA GTTTGAGGGG ACGACGACCG TATCGGCCTCTGCCGCCTAA CTGGCATTAC CCTATCCAAT GCAACCACAT CTACCCGCGT AGCATTGGCA CGTAGACGGT CAAACTCCCC TGCTGCTGGC ATAGCCGGAG4601AGGAAGATCG CACTCCAGCC AGCTTTCCGG CACCGCTTCT GGTGCCGGAA ACCAGGCAAA GCGCCATTCG CCATTCAGGC TGCGCAACTG TTGGGAAGGGTCCTTCTAGC GTGAGGTCGG TCGAAAGGCC GTGGCGAAGA CCACGGCCTT TGGTCCGTTT 
CGCGGTAAGC GGTAAGTCCG ACGCGTTGAC AACCCTTCCC4701CGATCGGTGC GGGCCTCTTC GCTATTACGC CAGCTGGCGA AAGGGGGATG TGCTGCAAGG CGATTAAGTT GGGTAACGCC AGGGTTTTCC CAGTCACGACGCTAGCCACG CCCGGAGAAG CGATAATGCG GTCGACCGCT TTCCCCCTAC ACGACGTTCC GCTAATTCAA CCCATTGCGG TCCCAAAAGG GTCAGTGCTG4801GTTGTAAAAC GACGGCCAGT GAATTGCAAT TCGTAATCAT GGTCATAGCT GTTTCCTGTG TGAAATTGTT ATCCGCTCAC AATTCCACAC AACATACGAGCAACATTTTG CTGCCGGTCA CTTAACGTTA AGCATTAGTA CCAGTATCGA CAAAGGACAC ACTTTAACAA TAGGCGAGTG TTAAGGTGTG TTGTATGCTC                                                                                         I-SceI                                                                                   ~~~~~~~~~~~~~~~~~~~~4901CCGGAAGCAT AAAGTGTAAA GCCTGGGGTG CCTAATGAGT GAGCTAACTC ACATTAATTG CGTTGCGCTC ACTGCCATTA CCCTGTTATC CCTAGTGAACGGCCTTCGTA TTTCACATTT CGGACCCCAC GGATTACTCA CTCGATTGAG TGTAATTAAC GCAACGCGAG TGACGGTAAT GGGACAATAG GGATCACTTG5001CATCACCCTA ATCAAGTTTT TTGGGGTCGA GGTGCCGTAA AGCACTAAAT CGGAACCCTA AAGGGAGCCC CCGATTTAGA GCTTGACGGG GAAAGCCGGCGTAGTGGGAT TAGTTCAAAA AACCCCAGCT CCACGGCATT TCGTGATTTA GCCTTGGGAT TTCCCTCGGG GGCTAAATCT CGAACTGCCC CTTTCGGCCG5101GAACGTGGCG AGAAAGGAAG GGAAGAAAGC GAAAGGAGCG GGCGCTAGGG CGCTGGCAAG TGTAGCGGTC ACGCTGCGCG TAACCACCAC ACCCGCCGCGCTTGCACCGC TCTTTCCTTC CCTTCTTTCG CTTTCCTCGC CCGCGATCCC GCGACCGTTC ACATCGCCAG TGCGACGCGC ATTGGTGGTG TGGGCGGCGC5201CTTAATGCGC CGCTACAGGG CGCGTCAGGT GGCACTTTTC GGGGAAATGT GCGCGGAACC CCTATTTGTT TATTTTTCTA AATACATTCA AATATGTATC GAATTACGCG GCGATGTCCC GCGCAGTCCA CCGTGAAAAG CCCCTTTACA CGCGCCTTGG GGATAAACAA ATAAAAAGAT TTATGTAAGT TTATACATAG 5301CGCTCATGAG ACAATAACCC TGATAAATGC TTCAATAATA ACGACCGGTA ATGAAAAAGG AAGAGTATGA GTATTCAACA TTTCCGTGTC GCCCTTATTC GCGAGTACTC TGTTATTGGG ACTATTTACG AAGTTATTAT TGCTGGCCAT TACTTTTTCC TTCTCATACT CATAAGTTGT AAAGGCACAG CGGGAATAAG 5401CCTTTTTTGC GGCATTTTGC CTTCCTGTTT TTGCTCACCC AGAAACGCTG GTGAAAGTAA AAGATGCTGA AGATCAGTTG GGTGCACGAG TGGGTTACAT GGAAAAAACG CCGTAAAACG GAAGGACAAA AACGAGTGGG TCTTTGCGAC CACTTTCATT TTCTACGACT 
TCTAGTCAAC CCACGTGCTC ACCCAATGTA 5501CGAACTGGAT CTCAACAGCG GTAAGATCCT TGAGAGTTTT CGCCCCGAAG AACGTTTTCC AATGATGAGC ACTTTTAAAG TTCTGCTATG TGGCGCGGTA GCTTGACCTA GAGTTGTCGC CATTCTAGGA ACTCTCAAAA GCGGGGCTTC TTGCAAAAGG TTACTACTCG TGAAAATTTC AAGACGATAC ACCGCGCCAT 5601TTATCCCGTA TTGACGCCGG GCAAGAGCAA CTCGGTCGCC GCATACACTA TTCTCAGAAT GACTTGGTTG AGTCTAGCGT TGATCGGCAC GTAAGAGGTT AATAGGGCAT AACTGCGGCC CGTTCTCGTT GAGCCAGCGG CGTATGTGAT AAGAGTCTTA CTGAACCAAC TCAGATCGCA ACTAGCCGTG CATTCTCCAA 5701CCAACTTTCA CCATAATGAA ATAAGATCAC TACCGGGCGT ATTTTTTGAG TTATCGAGAT TTTCAGGAGC TAAGGAAGCT AAAATGGAGA AAAAAATCAC GGTTGAAAGT GGTATTACTT TATTCTAGTG ATGGCCCGCA TAAAAAACTC AATAGCTCTA AAAGTCCTCG ATTCCTTCGA TTTTACCTCT TTTTTTAGTG 5801TGGATATACC ACCGTTGATA TATCCCAATG GCATCGTAAA GAACATTTTG AGGCATTTCA GTCAGTTGCT CAATGTACCT ATAACCAGAC CGTTCAGCTG ACCTATATGG TGGCAACTAT ATAGGGTTAC CGTAGCATTT CTTGTAAAAC TCCGTAAAGT CAGTCAACGA GTTACATGGA TATTGGTCTG GCAAGTCGAC 5901GATATTACGG CCTTTTTAAA GACCGTAAAG AAAAATAAGC ACAAGTTTTA TCCGGCCTTT ATTCACATTC TTGCCCGCCT GATGAATGCT CATCCGGAAT CTATAATGCC GGAAAAATTT CTGGCATTTC TTTTTATTCG TGTTCAAAAT AGGCCGGAAA TAAGTGTAAG AACGGGCGGA CTACTTACGA GTAGGCCTTA 6001TCCGTATGGC AATGAAAGAC GGTGAGCTGG TGATATGGGA TAGTGTTCAC CCTTGTTACA CCGTTTTCCA TGAGCAAACT GAAACGTTTT CATCGCTCTG AGGCATACCG TTACTTTCTG CCACTCGACC ACTATACCCT ATCACAAGTG GGAACAATGT GGCAAAAGGT ACTCGTTTGA CTTTGCAAAA GTAGCGAGAC 6101GAGTGAATAC CACGACGATT TCCGGCAGTT TCTACACATA TATTCGCAAG ATGTGGCGTG TTACGGTGAA AACCTGGCCT ATTTCCCTAA AGGGTTTATT CTCACTTATG GTGCTGCTAA AGGCCGTCAA AGATGTGTAT ATAAGCGTTC TACACCGCAC AATGCCACTT TTGGACCGGA TAAAGGGATT TCCCAAATAA 6201GAGAATATGT TTTTCGTATC AGCCAATCCC TGGGTGAGTT TCACCAGTTT TGATTTAAAC GTGGCCAATA TGGACAACTT CTTCGCCCCC GTTTTCACCA CTCTTATACA AAAAGCATAG TCGGTTAGGG ACCCACTCAA AGTGGTCAAA ACTAAATTTG CACCGGTTAT ACCTGTTGAA GAAGCGGGGG CAAAAGTGGT 6301TGGGCAAATA TTATACGCAA GGCGACAAGG TGCTGATGCC GCTGGCGATT CAGGTTCATC ATGCCGTCTG TGATGGCTTC CATGTCGGCA GAATGCTTAA ACCCGTTTAT AATATGCGTT CCGCTGTTCC ACGACTACGG CGACCGCTAA 
GTCCAAGTAG TACGGCAGAC ACTACCGAAG GTACAGCCGT CTTACGAATT 6401TGAATTACAA CAGTACTGCG ATGAGTGGCA GGGCGGGGCG TAATTTTTTT AAGGCAGTTA TTGGTGCCCT TAAACGCCTG GTGCTACGCC TGAATAAGTG ACTTAATGTT GTCATGACGC TACTCACCGT CCCGCCCCGC ATTAAAAAAA TTCCGTCAAT AACCACGGGA ATTTGCGGAC CACGATGCGG ACTTATTCAC 6501ATAATAAGCG GATGAATGGC AGAAATTCGA AATGACCGAC CAAGCGACGC CCAACCTGCC ATCACGAGAT TTCGATTCCA CCGCCGCCTT CTATGAAAGG TATTATTCGC CTACTTACCG TCTTTAAGCT TTACTGGCTG GTTCGCTGCG GGTTGGACGG TAGTGCTCTA AAGCTAAGGT GGCGGCGGAA GATACTTTCC 6601TTGGGCTTCG GAATCGTTTT CCGGGACGCC GGCTGGATGA TCCTCCAGCG CGGGGATCTC ATGCTGGAGT TCTTCGCCCA CCCTAGGGGG AGGCTAACTG AACCCGAAGC CTTAGCAAAA GGCCCTGCGG CCGACCTACT AGGAGGTCGC GCCCCTAGAG TACGACCTCA AGAAGCGGGT GGGATCCCCC TCCGATTGAC 6701AAACACGGAA GGAGACAATA CCGGAAGGAA CCCGCGCTAT GACGGCAATA AAAAGACAGA ATAAAACGCA CGGTGTTGGG TCGTTTGTTC ATAAACGCGG TTTGTGCCTT CCTCTGTTAT GGCCTTCCTT GGGCGCGATA CTGCCGTTAT TTTTCTGTCT TATTTTGCGT GCCACAACCC AGCAAACAAG TATTTGCGCC 6801GGTTCGGTCC CAGGGCTGGC ACTCTGTCGA TACCCCACCG AGACCCCATT GGGGCCAATA CGCCCGCGTT TCTTCCTTTT CCCCACCCCA CCCCCCAAGT CCAAGCCAGG GTCCCGACCG TGAGACAGCT ATGGGGTGGC TCTGGGGTAA CCCCGGTTAT GCGGGCGCAA AGAAGGAAAA GGGGTGGGGT GGGGGGTTCA 6901TCGGGTGAAG GCCCAGGGCT CGCAGCCAAC GTCGGGGCGG CAGGCCCTGC CATAGCCTCA GGTTACTCAT ATATACTTTA GATTGATTTA AAACTTCATT AGCCCACTTC CGGGTCCCGA GCGTCGGTTG CAGCCCCGCC GTCCGGGACG GTATCGGAGT CCAATGAGTA TATATGAAAT CTAACTAAAT TTTGAAGTAA 7001TTTAATTTAA AAGGATCTAG GTGAAGATCC TTTTTGATAA TCTCATGACC AAAATCCCTT AACGTGAGTT TTCGTTCCAC TGAGCGTCAG ACCCCGTAGA AAATTAAATT TTCCTAGATC CACTTCTAGG AAAAACTATT AGAGTACTGG TTTTAGGGAA TTGCACTCAA AAGCAAGGTG ACTCGCAGTC TGGGGCATCT 7101AAAGATCAAA GGATCTTCTT GAGATCCTTT TTTTCTGCGC GTAATCTGCT GCTTGCAAAC AAAAAAACCA CCGCTACCAG CGGTGGTTTG TTTGCCGGAT TTTCTAGTTT CCTAGAAGAA CTCTAGGAAA AAAAGACGCG CATTAGACGA CGAACGTTTG TTTTTTTGGT GGCGATGGTC GCCACCAAAC AAACGGCCTA 7201CAAGAGCTAC CAACTCTTTT TCCGAAGGTA ACTGGCTTCA GCAGAGCGCA GATACCAAAT ACTGTCCTTC TAGTGTAGCC GTAGTTAGGC CACCACTTCA GTTCTCGATG GTTGAGAAAA AGGCTTCCAT 
TGACCGAAGT CGTCTCGCGT CTATGGTTTA TGACAGGAAG ATCACATCGG CATCAATCCG GTGGTGAAGT 7301AGAACTCTGT AGCACCGCCT ACATACCTCG CTCTGCTAAT CCTGTTACCA GTGGCTGCTG CCAGTGGCGA TAAGTCGTGT CTTACCGGGT TGGACTCAAG TCTTGAGACA TCGTGGCGGA TGTATGGAGC GAGACGATTA GGACAATGGT CACCGACGAC GGTCACCGCT ATTCAGCACA GAATGGCCCA ACCTGAGTTC 7401ACGATAGTTA CCGGATAAGG CGCAGCGGTC GGGCTGAACG GGGGGTTCGT GCACACAGCC CAGCTTGGAG CGAACGACCT ACACCGAACT GAGATACCTA TGCTATCAAT GGCCTATTCC GCGTCGCCAG CCCGACTTGC CCCCCAAGCA CGTGTGTCGG GTCGAACCTC GCTTGCTGGA TGTGGCTTGA CTCTATGGAT 7501CAGCGTGAGC TATGAGAAAG CGCCACGCTT CCCGAAGGGA GAAAGGCGGA CAGGTATCCG GTAAGCGGCA GGGTCGGAAC AGGAGAGCGC ACGAGGGAGC GTCGCACTCG ATACTCTTTC GCGGTGCGAA GGGCTTCCCT CTTTCCGCCT GTCCATAGGC CATTCGCCGT CCCAGCCTTG TCCTCTCGCG TGCTCCCTCG 7601TTCCAGGGGG AAACGCCTGG TATCTTTATA GTCCTGTCGG GTTTCGCCAC CTCTGACTTG AGCGTCGATT TTTGTGATGC TCGTCAGGGG GGCGGAGCTTAAGGTCCCCC TTTGCGGACC ATAGAAATAT CAGGACAGCC CAAAGCGGTG GAGACTGAAC TCGCAGCTAA AAACACTACG AGCAGTCCCC CCGCCTCGGA7701ATGGAAAAAC GCCAGCAACG CGGCCTTTTT ACGGTTCCTG GCCTTTTGCT GGCCTTTTGC TCACATGTTC TTTCCTGCGT TATCCCCTGA TTCTGTGGATTACCTTTTTG CGGTCGTTGC GCCGGAAAAA TGCCAAGGAC CGGAAAACGA CCGGAAAACG AGTGTACAAG AAAGGACGCA ATAGGGGACT AAGACACCTA7801AACCGTATTA CCGCCATGCA TTAGTTATTA ATAGTAATCA ATTACGGGGT CATTAGTTCA TAGCCCATAT ATGGAGTTCC GCGTTACATA ACTTACGGTATTGGCATAAT GGCGGTACGT AATCAATAAT TATCATTAGT TAATGCCCCA GTAATCAAGT ATCGGGTATA TACCTCAAGG CGCAATGTAT TGAATGCCAT7901AATGGCCCGC CTGGCTGACC GCCCAACGAC CCCCGCCCAT TGACGTCAAT AATGACGTAT GTTCCCATAG TAACGCCAAT AGGGACTTTC CATTGACGTCTTACCGGGCG GACCGACTGG CGGGTTGCTG GGGGCGGGTA ACTGCAGTTA TTACTGCATA CAAGGGTATC ATTGCGGTTA TCCCTGAAAG GTAACTGCAG8001AATGGGTGGA GTATTTACGG TAAACTGCCC ACTTGGCAGT ACATCAAGTG TATCATATGC CAAGTACGCC CCCTATTGAC GTCAATGACG GTAAATGGCCTTACCCACCT CATAAATGCC ATTTGACGGG TGAACCGTCA TGTAGTTCAC ATAGTATACG GTTCATGCGG GGGATAACTG CAGTTACTGC CATTTACCGG8101CGCCTGGCAT TATGCCCAGT ACATGACCTT ATGGGACTTT CCTACTTGGC AGTACATCTA CGTATTAGTC ATCGCTATTA CCATGGTGAT GCGGTTTTGGGCGGACCGTA ATACGGGTCA 
TGTACTGGAA TACCCTGAAA GGATGAACCG TCATGTAGAT GCATAATCAG TAGCGATAAT GGTACCACTA CGCCAAAACC8201CAGTACATCA ATGGGCGTGG ATAGCGGTTT GACTCACGGG GATTTCCAAG TCTCCACCCC ATTGACGTCA ATGGGAGTTT GTTTTGGCAC CAAAATCAACGTCATGTAGT TACCCGCACC TATCGCCAAA CTGAGTGCCC CTAAAGGTTC AGAGGTGGGG TAACTGCAGT TACCCTCAAA CAAAACCGTG GTTTTAGTTG8301GGGACTTTCC AAAATGTCGT AACAACTCCG CCCCATTGAC GCAAATGGGC GGTAGGCGTG TACGGTGGGA GGTCTATATA AGCAGAGCTCCCTGAAAGG TTTTACAGCA TTGTTGAGGC GGGGTAACTG CGTTTACCCG CCATCCGCAC ATGCCACCCT CCAGATATAT TCGTCTCGApVHentry-CBD1                               Esp3I                              ~~~~~~~ 1GGTTTAGTGA ACCGTCAGAT CCGCTAGACG TCTCATATAC CTGACTGGAA TACGACAGCT CCTGCAGCTT CTGGGCGAAG ACCACCGTGG CCCATTGCGTCCAAATCACT TGGCAGTCTA GGCGATCTGC AGAGTATATG GACTGACCTT ATGCTGTCGA GGACGTCGAA GACCCGCTTC TGGTGGCACC GGGTAACGCA101ACTTAGCGAT AATCTGGTCC GCTTGGAAGT TAGCACGGCG AGCGCGCTCC AGAGCCAAGT CACGCAGCTT AACAGTACCT ACCGCAGAGC GGTGCATGAATGAATCGCTA TTAGACCAGG CGAACCTTCA ATCGTGCCGC TCGCGCGAGG TCTCGGTTCA GTGCGTCGAA TTGTCATGGA TGGCGTCTCG CCACGTACTT201CAGGCCGATA ACGTTGTCCT TAGCAACCTT GACATTACCC TCACCTTTAT TGGCAGGGAA GACGTGCTTC TGACCAGTAG TGCCCTCACG AGCGGTACCAGTCCGGCTAT TGCAACAGGA ATCGTTGGAA CTGTAATGGG AGTGGAAATA ACCGTCCCTT CTGCACGAAG ACTGGTCATC ACGGGAGTGC TCGCCATGGT301GCACCACCAG CGGTGAGGTG CGGAACTTCT ACAACCTCAA AGCCCATAAC GTTGCGGATA GAACCCTTCT CAGGGTCAAT CAGAGCAGCG TAGTTTGCTGCGTGGTGGTC GCCACTCCAC GCCTTGAAGA TGTTGGAGTT TCGGGTATTG CAACGCCTAT CTTGGGAAGA GTCCCAGTTA GTCTCGTCGC ATCAAACGAC401CGTTCGGCAT CAGTGCTGCC AGAATCGCAG AGTAGCTATC TGGGTCACAG TAGAACACAC GGTCAGCAGC CGGAACATAG TTCTTGGTCA GAGCCGCACGGCAAGCCGTA GTCACGACGG TCTTAGCGTC TCATCGATAG ACCCAGTGTC ATCTTGTGTG CCAGTCGTCG GCCTTGTATC AAGAACCAGT CTCGGCGTGC501AGCCTTAGTC AGAGCCGCAA TAATCTCCTT ACCCAGCGCA ACTTGGTCGG TAAGTGCGGC CTTGTTCTGA GTGGTCTCAA TTACGGTAGC AGTACCTAAGTCGGAATCAG TCTCGGCGTT ATTAGAGGAA TGGGTCGCGT TGAACCAGCC ATTCACGCCG GAACAAGACT CACCAGAGTT AATGCCATCG TCATGGATTC601CCCTCGATGT TCTCATTATA TTTGCTTTCC ACGTTACACA GACCGGCAAT CTCAGCCAGA ACCGCACCAT 
CCGCAGCCAT CGCCAGAGAT TCACCCAACTGGGAGCTACA AGAGTAATAT AAACGAAAGG TGCAATGTGT CTGGCCGTTA GAGTCGGTCT TGGCGTGGTA GGCGTCGGTA GCGGTCTCTA AGTGGGTTGA 701GAGAGGTATA CTCAGAGCGA ACGTCGTAGT GGTTCATCGC GTCCTCAATA TCATAAATCA GAACGTCAGC CGTCAGGAGA CCGTCAATGG TGATTACCTT CTCTCCATAT GAGTCTCGCT TGCAGCATCA CCAAGTAGCG CAGGAGTTAT AGTATTTAGT CTTGCAGTCG GCAGTCCTCT GGCAGTTACC ACTAATGGAA 801CTCGGTGTGT TTGATGTCCT TACGTTTATC GTCGAGGTTC TCGCCCGGAG CCAGATACGC TGCCTGAGTG CGACCCAGAA CAGGGAACTG AGCGGATTTA GAGCCACACA AACTACAGGA ATGCAAATAG CAGCTCCAAG AGCGGGCCTC GGTCTATGCG ACGGACTCAC GCTGGGTCTT GTCCCTTGAC TCGCCTAAAT 901CCGCTGGAGA TGGAACGTAC CATGTGGCGA GAAGTGGTCA CGGAGGTACG AGCGAACGCA GTCAGGACTT CACCGCCAAA TACCTTCAAG AACAACGCCA GGCGACCTCT ACCTTGCATG GTACACCGCT CTTCACCAGT GCCTCCATGC TCGCTTGCGT CAGTCCTGAA GTGGCGGTTT ATGGAAGTTC TTGTTGCGGT                                                                                                         Esp3I                                                                                                        ~~~~~1001GTTTATCTCC AGCAGCAACT ACACCTTTAC CTTGGTTAGT ACCCATTTGC TGTCCACCAG TCATGCTAGC CATATGTATA TCTCCTTCTT AAAGTCGTCT CAAATAGAGG TCGTCGTTGA TGTGGAAATG GAACCAATCA TGGGTAAACG ACAGGTGGTC AGTACGATCG GTATACATAT AGAGGAAGAA TTTCAGCAGA Esp3I ~ 1101CCAGTGCCTC CACCAAGGGC CCATCGGTCT TCCCCCTGGC GCCCTGCTCC AGGAGCACCT CCGAGAGCAC AGCGGCCCTG GGCTGCCTGG TCAAGGACTA GGTCACGGAG GTGGTTCCCG GGTAGCCAGA AGGGGGACCG CGGGACGAGG TCCTCGTGGA GGCTCTCGTG TCGCCGGGAC CCGACGGACC AGTTCCTGAT 1201CTTCCCCGAA CCGGTGACGG TGTCGTGGAA CTCAGGCGCT CTGACCAGCG GCGTGCACAC CTTCCCAGCT GTCCTACAGT CCTCAGGACT CTACTCCCTC GAAGGGGCTT GGCCACTGCC ACAGCACCTT GAGTCCGCGA GACTGGTCGC CGCACGTGTG GAAGGGTCGA CAGGATGTCA GGAGTCCTGA GATGAGGGAG 1301AGCAGCGTGG TGACCGTGCC CTCCAGCAGC TTGGGCACCC AGACCTACAT CTGCAACGTG AATCACAAGC CCAGCAACAC CAAGGTGGAC AAGAAAGTTG TCGTCGCACC ACTGGCACGG GAGGTCGTCG AACCCGTGGG TCTGGATGTA GACGTTGCAC TTAGTGTTCG GGTCGTTGTG GTTCCACCTG TTCTTTCAAC 1401AGCCCAAATC TTGTGACAAA ACTCACACAT GCCCACCGTG CCCAGCACCT 
GAACTCCTGG GGGGACCGTC AGTCTTCCTC TTCCCCCCMA AACCCAAGGA TCGGGTTTAG AACACTGTTT TGAGTGTGTA CGGGTGGCAC GGGTCGTGGA CTTGAGGACC CCCCTGGCAG TCAGAAGGAG AAGGGGGGKT TTGGGTTCCT 1501CACCCTCATG ATCTCCCGGA CCCCTGAGGT CACATGCGTG GTGGTGGACG TGAGCCACGA AGACCCTGAG GTCAAGTTCA ACTGGTACGT GGACGGCGTG GTGGGAGTAC TAGAGGGCCT GGGGACTCCA GTGTACGCAC CACCACCTGC ACTCGGTGCT TCTGGGACTC CAGTTCAAGT TGACCATGCA CCTGCCGCAC 1601GAGGTGCATA ATGCCAAGAC AAAGCCGCGG GAGGAGCAGT ACAACAGCAC GTACCGTGTG GTCAGCGTCC TCACCGTCCT GCACCAGGAC TGGCTGAATG CTCCACGTAT TACGGTTCTG TTTCGGCGCC CTCCTCGTCA TGTTGTCGTG CATGGCACAC CAGTCGCAGG AGTGGCAGGA CGTGGTCCTG ACCGACTTAC 1701GCAAGGAGTA CAAGTGCAAG GTCTCCAACA AAGCCCTCCC AGCCCCCATC GAGAAAACCA TCTCCAAAGC CAAAGGGCAG CCCCGAGAAC CACAGGTGTA CGTTCCTCAT GTTCACGTTC CAGAGGTTGT TTCGGGAGGG TCGGGGGTAG CTCTTTTGGT AGAGGTTTCG GTTTCCCGTC GGGGCTCTTG GTGTCCACAT 1801CACCCTGCCC CCATCCCGGG ATGAGCTGAC CAAGAACCAG GTCAGCCTGA CCTGCCTGGT CAAAGGCTTC TACCCCAGCG ACATCGCCGT GGAGTGGGAG GTGGGACGGG GGTAGGGCCC TACTCGACTG GTTCTTGGTC CAGTCGGACT GGACGGACCA GTTTCCGAAG ATGGGGTCGC TGTAGCGGCA CCTCACCCTC 1901AGCAATGGGC AGCCGGAGAA CAACTACAAG ACCACGCCTC CCATGCTGGA CTCCGACGGC TCCTTCTTCC TCTACAGCAA GCTCACCGTG GACAAGAGCA TCGTTACCCG TCGGCCTCTT GTTGATGTTC TGGTGCGGAG GGTACGACCT GAGGCTGCCG AGGAAGAAGG AGATGTCGTT CGAGTGGCAC CTGTTCTCGT 2001GGTGGCAGCA GGGGAACGTC TTCTCATGCT CCGTGATGCA TGAGGCTCTG CACAACCACT ACACGCAGAA GAGCCTCTCC CTGTCTCCGG GTAAAGGGTA CCACCGTCGT CCCCTTGCAG AAGAGTACGA GGCACTACGT ACTCCGAGAC GTGTTGGTGA TGTGCGTCTT CTCGGAGAGG GACAGAGGCC CATTTCCCAT 2101CATGTCCCAT ATGCTCGACA TGGCAAGCAG CCTGAGACAG ATTCTGGACT CCCAGAAAAT GGAGTGGAGG TCCAACGCCG GGGGCAGCGG TAGGGATAAG GTACAGGGTA TACGAGCTGT ACCGTTCGTC GGACTCTGTC TAAGACCTGA GGGTCTTTTA CCTCACCTCC AGGTTGCGGC CCCCGTCGCC ATCCCTATTC 2201TGGTCAGATC TGGTACCGCG GGCGGCGACC AGCAGCATGA GCGTGGAATT TTATAACAGC AACAAAAGCG CGCAGACCAA CAGCATTACC CCGATTATTA ACCAGTCTAG ACCATGGCGC CCGCCGCTGG TCGTCGTACT CGCACCTTAA AATATTGTCG TTGTTTTCGC GCGTCTGGTT GTCGTAATGG GGCTAATAAT 2301AAATTACCAA CACCAGCGAT AGCGATCTGA 
ACCTGAACGA TGTGAAAGTG CGCTATTATT ATACCAGCGA TGGCACCCAG GGCCAGACCT TTTGGTGCGA TTTAATGGTT GTGGTCGCTA TCGCTAGACT TGGACTTGCT ACACTTTCAC GCGATAATAA TATGGTCGCT ACCGTGGGTC CCGGTCTGGA AAACCACGCT 2401TCATGCGGGC GCGCTGCTGG GCAACAGCTA TGTGGATAAC ACCAGCAAAG TGACCGCGAA CTTTGTGAAA GAAACCGCGA GCCCGACCAG CACCTATGAT AGTACGCCCG CGCGACGACC CGTTGTCGAT ACACCTATTG TGGTCGTTTC ACTGGCGCTT GAAACACTTT CTTTGGCGCT CGGGCTGGTC GTGGATACTA 2501ACCTATGTGG AATTTGGCTT TGCGAGTGGC CGCGCGACCC TGAAAAAAGG CCAGTTTATT ACCATTCAGG GCCGCATTAC CAAAAGCGAT TGGAGCAACT TGGATACACC TTAAACCGAA ACGCTCACCG GCGCGCTGGG ACTTTTTTCC GGTCAAATAA TGGTAAGTCC CGGCGTAATG GTTTTCGCTA ACCTCGTTGA 2601ATACCCAGAC CAACGATTAT AGCTTTGATG CGAGCAGCAG CACCCCGGTG GTGAACCCGA AAGTGACCGG CTATATTGGC GGCGCGAAAG TGCTGGGCAC TATGGGTCTG GTTGCTAATA TCGAAACTAC GCTCGTCGTC GTGGGGCCAC CACTTGGGCT TTCACTGGCC GATATAACCG CCGCGCTTTC ACGACCCGTG 2701CGCGCCGTAA AGCGGCCGCA ATTTAATTCC GGTTATTTTC CACCATATTG CCGTCTTTTG GCAATGTGAG GGCCCGGAAA CCTGGCCCTG TCTTCTTGAC GCGCGGCATT TCGCCGGCGT TAAATTAAGG CCAATAAAAG GTGGTATAAC GGCAGAAAAC CGTTACACTC CCGGGCCTTT GGACCGGGAC AGAAGAACTG 2801GAGCATTCCT AGGGGTCTTT CCCCTCTCGC CAAAGGAATG CAAGGTCTGT TGAATGTCGT GAAGGAAGCA GTTCCTCTGG AAGCTTCTTG AAGACAAACA CTCGTAAGGA TCCCCAGAAA GGGGAGAGCG GTTTCCTTAC GTTCCAGACA ACTTACAGCA CTTCCTTCGT CAAGGAGACC TTCGAAGAAC TTCTGTTTGT 2901ACGTCTGTAG CGACCCTTTG CAGGCAGCGG AACCCCCCAC CTGGCGACAG GTGCCTCTGC GGCCAAAAGC CACGTGTATA AGATACACCT GCAAAGGCGG TGCAGACATC GCTGGGAAAC GTCCGTCGCC TTGGGGGGTG GACCGCTGTC CACGGAGACG CCGGTTTTCG GTGCACATAT TCTATGTGGA CGTTTCCGCC 3001CACAACCCCA GTGCCACGTT GTGAGTTGGA TAGTTGTGGA AAGAGTCAAA TGGCTCACCT CAAGCGTATT CAACAAGGGG CTGAAGGATG CCCAGAAGGT GTGTTGGGGT CACGGTGCAA CACTCAACCT ATCAACACCT TTCTCAGTTT ACCGAGTGGA GTTCGCATAA GTTGTTCCCC GACTTCCTAC GGGTCTTCCA 3101ACCCCATTGT ATGGGATCTG ATCTGGGGCC TCGGTGCACA TGCTTTACAT GTGTTTAGTC GAGGTTAAAA AACGTCTAGG CCCCCCGAAC CACGGGGACG TGGGGTAACA TACCCTAGAC TAGACCCCGG AGCCACGTGT ACGAAATGTA CACAAATCAG CTCCAATTTT TTGCAGATCC GGGGGGCTTG GTGCCCCTGC 3201TGGTTTTCCT 
TTGAAAAACA CGATGATAAT ATGGCCACCA CCCATACCTA GGCTTTTGCA AAGATCGATC AAGAGACAGG ATGAGGATCG TTTCGCATGA ACCAAAAGGA AACTTTTTGT GCTACTATTA TACCGGTGGT GGGTATGGAT CCGAAAACGT TTCTAGCTAG TTCTCTGTCC TACTCCTAGC AAAGCGTACT 3301TTGAACAAGA TGGATTGCAC GCAGGTTCTC CGGCCGCTTG GGTGGAGAGG CTATTCGGCT ATGACTGGGC ACAACAGACA ATCGGCTGCT CTGATGCCGC AACTTGTTCT ACCTAACGTG CGTCCAAGAG GCCGGCGAAC CCACCTCTCC GATAAGCCGA TACTGACCCG TGTTGTCTGT TAGCCGACGA GACTACGGCG 3401CGTGTTCCGG CTGTCAGCGC AGGGGCGCCC GGTTCTTTTT GTCAAGACCG ACCTGTCCGG TGCCCTGAAT GAACTGCAAG ACGAGGCAGC GCGGCTATCGGCACAAGGCC GACAGTCGCG TCCCCGCGGG CCAAGAAAAA CAGTTCTGGC TGGACAGGCC ACGGGACTTA CTTGACGTTC TGCTCCGTCG CGCCGATAGC3501TGGCTGGCCA CGACGGGCGT TCCTTGCGCA GCTGTGCTCG ACGTTGTCAC TGAAGCGGGA AGGGACTGGC TGCTATTGGG CGAAGTGCCG GGGCAGGATCACCGACCGGT GCTGCCCGCA AGGAACGCGT CGACACGAGC TGCAACAGTG ACTTCGCCCT TCCCTGACCG ACGATAACCC GCTTCACGGC CCCGTCCTAG3601TCCTGTCATC TCACCTTGCT CCTGCCGAGA AAGTATCCAT CATGGCTGAT GCAATGCGGC GGCTGCATAC GCTTGATCCG GCTACCTGCC CATTCGACCAAGGACAGTAG AGTGGAACGA GGACGGCTCT TTCATAGGTA GTACCGACTA CGTTACGCCG CCGACGTATG CGAACTAGGC CGATGGACGG GTAAGCTGGT3701CCAAGCGAAA CATCGCATCG AGCGAGCACG TACTCGGATG GAAGCCGGTC TTGTCGATCA GGATGATCTG GACGAAGAGC ATCAGGGGCT CGCGCCAGCCGGTTCGCTTT GTAGCGTAGC TCGCTCGTGC ATGAGCCTAC CTTCGGCCAG AACAGCTAGT CCTACTAGAC CTGCTTCTCG TAGTCCCCGA GCGCGGTCGG3801GAACTGTTCG CCAGGCTCAA GGCGAGCATG CCCGACGGCG AGGATCTCGT CGTGACCCAT GGCGATGCCT GCTTGCCGAA TATCATGGTG GAAAATGGCCCTTGACAAGC GGTCCGAGTT CCGCTCGTAC GGGCTGCCGC TCCTAGAGCA GCACTGGGTA CCGCTACGGA CGAACGGCTT ATAGTACCAC CTTTTACCGG3901GCTTTTCTGG ATTCATCGAC TGTGGCCGGC TGGGTGTGGC GGACCGCTAT CAGGACATAG CGTTGGCTAC CCGTGATATT GCTGAAGAGC TTGGCGGCGACGAAAAGACC TAAGTAGCTG ACACCGGCCG ACCCACACCG CCTGGCGATA GTCCTGTATC GCAACCGATG GGCACTATAA CGACTTCTCG AACCGCCGCT4001ATGGGCTGAC CGCTTCCTCG TGCTTTACGG TATCGCCGCT CCCGATTCGC AGCGCATCGC CTTCTATCGC CTTCTTGACG AGTTCTTCTG AGCGGGACTCTACCCGACTG GCGAAGGAGC ACGAAATGCC ATAGCGGCGA GGGCTAAGCG TCGCGTAGCG GAAGATAGCG GAAGAACTGC TCAAGAAGAC 
TCGCCCTGAG4101TGGGGTTCGG GCCGCACTCG AGCATAAACT TGTTTATTGC AGCTTATAAT GGTTACAAAT AAAGCAATAG CATCACAAAT TTCACAAATA AAGCATTTTTACCCCAAGCC CGGCGTGAGC TCGTATTTGA ACAAATAACG TCGAATATTA CCAATGTTTA TTTCGTTATC GTAGTGTTTA AAGTGTTTAT TTCGTAAAAA                                                             I-SceI                                                       ~~~~~~~~~~~~~~~~~~~4201TTCACTGCAT TCTAGTTGTG GTTTGTCCAA ACTCATCAAT GTATCTTAAG TAGGGATAAC AGGGTAATTT TGTTAAATCA GCTCATTTTT TAACCAATAGAAGTGACGTA AGATCAACAC CAAACAGGTT TGAGTAGTTA CATAGAATTC ATCCCTATTG TCCCATTAAA ACAATTTAGT CGAGTAAAAA ATTGGTTATC4301GAACGCCATC AAAAATAATT CGCGTCTGGC CTTCCTGTAG CCAGCTTTCA TCAACATTAA ATGTGAGCGA GTAACAACCC GTCGGATTCT CCGTGGGAACCTTGCGGTAG TTTTTATTAA GCGCAGACCG GAAGGACATC GGTCGAAAGT AGTTGTAATT TACACTCGCT CATTGTTGGG CAGCCTAAGA GGCACCCTTG4401AAACGGCGGA TTGACCGTAA TGGGATAGGT TACGTTGGTG TAGATGGGCG CATCGTAACC GTGCATCTGC CAGTTTGAGG GGACGACGAC CGTATCGGCC TTTGCCGCCT AACTGGCATT ACCCTATCCA ATGCAACCAC ATCTACCCGC GTAGCATTGG CACGTAGACG GTCAAACTCC CCTGCTGCTG GCATAGCCGG 4501TCAGGAAGAT CGCACTCCAG CCAGCTTTCC GGCACCGCTT CTGGTGCCGG AAACCAGGCA AAGCGCCATT CGCCATTCAG GCTGCGCAAC TGTTGGGAAG AGTCCTTCTA GCGTGAGGTC GGTCGAAAGG CCGTGGCGAA GACCACGGCC TTTGGTCCGT TTCGCGGTAA GCGGTAAGTC CGACGCGTTG ACAACCCTTC 4601GGCGATCGGT GCGGGCCTCT TCGCTATTAC GCCAGCTGGC GAAAGGGGGA TGTGCTGCAA GGCGATTAAG TTGGGTAACG CCAGGGTTTT CCCAGTCACG CCGCTAGCCA CGCCCGGAGA AGCGATAATG CGGTCGACCG CTTTCCCCCT ACACGACGTT CCGCTAATTC AACCCATTGC GGTCCCAAAA GGGTCAGTGC 4701ACGTTGTAAA ACGACGGCCA GTGAATTGCA ATTCGTAATC ATGGTCATAG CTGTTTCCTG TGTGAAATTG TTATCCGCTC ACAATTCCAC ACAACATACG TGCAACATTT TGCTGCCGGT CACTTAACGT TAAGCATTAG TACCAGTATC GACAAAGGAC ACACTTTAAC AATAGGCGAG TGTTAAGGTG TGTTGTATGC                                                                                            I-SceI                                                                                     ~~~~~~~~~~~~~~~~~~~~4801AGCCGGAAGC ATAAAGTGTA AAGCCTGGGG TGCCTAATGA GTGAGCTAAC TCACATTAAT TGCGTTGCGC 
TCACTGCCAT TACCCTGTTA TCCCTAGTGA TCGGCCTTCG TATTTCACAT TTCGGACCCC ACGGATTACT CACTCGATTG AGTGTAATTA ACGCAACGCG AGTGACGGTA ATGGGACAAT AGGGATCACT 4901ACCATCACCC TAATCAAGTT TTTTGGGGTC GAGGTGCCGT AAAGCACTAA ATCGGAACCC TAAAGGGAGC CCCCGATTTA GAGCTTGACG GGGAAAGCCG TGGTAGTGGG ATTAGTTCAA AAAACCCCAG CTCCACGGCA TTTCGTGATT TAGCCTTGGG ATTTCCCTCG GGGGCTAAAT CTCGAACTGC CCCTTTCGGC 5001GCGAACGTGG CGAGAAAGGA AGGGAAGAAA GCGAAAGGAG CGGGCGCTAG GGCGCTGGCA AGTGTAGCGG TCACGCTGCG CGTAACCACC ACACCCGCCG CGCTTGCACC GCTCTTTCCT TCCCTTCTTT CGCTTTCCTC GCCCGCGATC CCGCGACCGT TCACATCGCC AGTGCGACGC GCATTGGTGG TGTGGGCGGC 5101CGCTTAATGC GCCGCTACAG GGCGCGTCAG GTGGCACTTT TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA GCGAATTACG CGGCGATGTC CCGCGCAGTC CACCGTGAAA AGCCCCTTTA CACGCGCCTT GGGGATAAAC AAATAAAAAG ATTTATGTAA GTTTATACAT 5201TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TAACGACCGG TAATGAAAAA GGAAGAGTAT GAGTATTCAA CATTTCCGTG TCGCCCTTAT AGGCGAGTAC TCTGTTATTG GGACTATTTA CGAAGTTATT ATTGCTGGCC ATTACTTTTT CCTTCTCATA CTCATAAGTT GTAAAGGCAC AGCGGGAATA 5301TCCCTTTTTT GCGGCATTTT GCCTTCCTGT TTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGGGTGCACG AGTGGGTTAC AGGGAAAAAA CGCCGTAAAA CGGAAGGACA AAAACGAGTG GGTCTTTGCG ACCACTTTCA TTTTCTACGA CTTCTAGTCA ACCCACGTGC TCACCCAATG 5401ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG TAGCTTGACC TAGAGTTGTC GCCATTCTAG GAACTCTCAA AAGCGGGGCT TCTTGCAAAA GGTTACTACT CGTGAAAATT TCAAGACGAT ACACCGCGCC 5501TATTATCCCG TATTGACGCC GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA ATGACTTGGT TGAGTCTAGC GTTGATCGGC ACGTAAGAGG ATAATAGGGC ATAACTGCGG CCCGTTCTCG TTGAGCCAGC GGCGTATGTG ATAAGAGTCT TACTGAACCA ACTCAGATCG CAACTAGCCG TGCATTCTCC 5601TTCCAACTTT CACCATAATG AAATAAGATC ACTACCGGGC GTATTTTTTG AGTTATCGAG ATTTTCAGGA GCTAAGGAAG CTAAAATGGA GAAAAAAATC AAGGTTGAAA GTGGTATTAC TTTATTCTAG TGATGGCCCG CATAAAAAAC TCAATAGCTC TAAAAGTCCT CGATTCCTTC GATTTTACCT CTTTTTTTAG 5701ACTGGATATA CCACCGTTGA TATATCCCAA TGGCATCGTA AAGAACATTT 
TGAGGCATTT CAGTCAGTTG CTCAATGTAC CTATAACCAG ACCGTTCAGC TGACCTATAT GGTGGCAACT ATATAGGGTT ACCGTAGCAT TTCTTGTAAA ACTCCGTAAA GTCAGTCAAC GAGTTACATG GATATTGGTC TGGCAAGTCG 5801TGGATATTAC GGCCTTTTTA AAGACCGTAA AGAAAAATAA GCACAAGTTT TATCCGGCCT TTATTCACAT TCTTGCCCGC CTGATGAATG CTCATCCGGA ACCTATAATG CCGGAAAAAT TTCTGGCATT TCTTTTTATT CGTGTTCAAA ATAGGCCGGA AATAAGTGTA AGAACGGGCG GACTACTTAC GAGTAGGCCT 5901ATTCCGTATG GCAATGAAAG ACGGTGAGCT GGTGATATGG GATAGTGTTC ACCCTTGTTA CACCGTTTTC CATGAGCAAA CTGAAACGTT TTCATCGCTC TAAGGCATAC CGTTACTTTC TGCCACTCGA CCACTATACC CTATCACAAG TGGGAACAAT GTGGCAAAAG GTACTCGTTT GACTTTGCAA AAGTAGCGAG 6001TGGAGTGAAT ACCACGACGA TTTCCGGCAG TTTCTACACA TATATTCGCA AGATGTGGCG TGTTACGGTG AAAACCTGGC CTATTTCCCT AAAGGGTTTA ACCTCACTTA TGGTGCTGCT AAAGGCCGTC AAAGATGTGT ATATAAGCGT TCTACACCGC ACAATGCCAC TTTTGGACCG GATAAAGGGA TTTCCCAAAT 6101TTGAGAATAT GTTTTTCGTA TCAGCCAATC CCTGGGTGAG TTTCACCAGT TTTGATTTAA ACGTGGCCAA TATGGACAAC TTCTTCGCCC CCGTTTTCAC AACTCTTATA CAAAAAGCAT AGTCGGTTAG GGACCCACTC AAAGTGGTCA AAACTAAATT TGCACCGGTT ATACCTGTTG AAGAAGCGGG GGCAAAAGTG 6201CATGGGCAAA TATTATACGC AAGGCGACAA GGTGCTGATG CCGCTGGCGA TTCAGGTTCA TCATGCCGTC TGTGATGGCT TCCATGTCGG CAGAATGCTT GTACCCGTTT ATAATATGCG TTCCGCTGTT CCACGACTAC GGCGACCGCT AAGTCCAAGT AGTACGGCAG ACACTACCGA AGGTACAGCC GTCTTACGAA 6301AATGAATTAC AACAGTACTG CGATGAGTGG CAGGGCGGGG CGTAATTTTT TTAAGGCAGT TATTGGTGCC CTTAAACGCC TGGTGCTACG CCTGAATAAG TTACTTAATG TTGTCATGAC GCTACTCACC GTCCCGCCCC GCATTAAAAA AATTCCGTCA ATAACCACGG GAATTTGCGG ACCACGATGC GGACTTATTC 6401TGATAATAAG CGGATGAATG GCAGAAATTC GAAATGACCG ACCAAGCGAC GCCCAACCTG CCATCACGAG ATTTCGATTC CACCGCCGCC TTCTATGAAA ACTATTATTC GCCTACTTAC CGTCTTTAAG CTTTACTGGC TGGTTCGCTG CGGGTTGGAC GGTAGTGCTC TAAAGCTAAG GTGGCGGCGG AAGATACTTT 6501GGTTGGGCTT CGGAATCGTT TTCCGGGACG CCGGCTGGAT GATCCTCCAG CGCGGGGATC TCATGCTGGA GTTCTTCGCC CACCCTAGGG GGAGGCTAAC CCAACCCGAA GCCTTAGCAA AAGGCCCTGC GGCCGACCTA CTAGGAGGTC GCGCCCCTAG AGTACGACCT CAAGAAGCGG GTGGGATCCC CCTCCGATTG 6601TGAAACACGG AAGGAGACAA TACCGGAAGG 
AACCCGCGCT ATGACGGCAA TAAAAAGACA GAATAAAACG CACGGTGTTG GGTCGTTTGT TCATAAACGC ACTTTGTGCC TTCCTCTGTT ATGGCCTTCC TTGGGCGCGA TACTGCCGTT ATTTTTCTGT CTTATTTTGC GTGCCACAAC CCAGCAAACA AGTATTTGCG 6701GGGGTTCGGT CCCAGGGCTG GCACTCTGTC GATACCCCAC CGAGACCCCA TTGGGGCCAA TACGCCCGCG TTTCTTCCTT TTCCCCACCC CACCCCCCAA CCCCAAGCCA GGGTCCCGAC CGTGAGACAG CTATGGGGTG GCTCTGGGGT AACCCCGGTT ATGCGGGCGC AAAGAAGGAA AAGGGGTGGG GTGGGGGGTT 6801GTTCGGGTGA AGGCCCAGGG CTCGCAGCCA ACGTCGGGGC GGCAGGCCCT GCCATAGCCT CAGGTTACTC ATATATACTT TAGATTGATT TAAAACTTCACAAGCCCACT TCCGGGTCCC GAGCGTCGGT TGCAGCCCCG CCGTCCGGGA CGGTATCGGA GTCCAATGAG TATATATGAA ATCTAACTAA ATTTTGAAGT6901TTTTTAATTT AAAAGGATCT AGGTGAAGAT CCTTTTTGAT AATCTCATGA CCAAAATCCC TTAACGTGAG TTTTCGTTCC ACTGAGCGTC AGACCCCGTAAAAAATTAAA TTTTCCTAGA TCCACTTCTA GGAAAAACTA TTAGAGTACT GGTTTTAGGG AATTGCACTC AAAAGCAAGG TGACTCGCAG TCTGGGGCAT7001GAAAAGATCA AAGGATCTTC TTGAGATCCT TTTTTTCTGC GCGTAATCTG CTGCTTGCAA ACAAAAAAAC CACCGCTACC AGCGGTGGTT TGTTTGCCGGCTTTTCTAGT TTCCTAGAAG AACTCTAGGA AAAAAAGACG CGCATTAGAC GACGAACGTT TGTTTTTTTG GTGGCGATGG TCGCCACCAA ACAAACGGCC7101ATCAAGAGCT ACCAACTCTT TTTCCGAAGG TAACTGGCTT CAGCAGAGCG CAGATACCAA ATACTGTCCT TCTAGTGTAG CCGTAGTTAG GCCACCACTTTAGTTCTCGA TGGTTGAGAA AAAGGCTTCC ATTGACCGAA GTCGTCTCGC GTCTATGGTT TATGACAGGA AGATCACATC GGCATCAATC CGGTGGTGAA7201CAAGAACTCT GTAGCACCGC CTACATACCT CGCTCTGCTA ATCCTGTTAC CAGTGGCTGC TGCCAGTGGC GATAAGTCGT GTCTTACCGG GTTGGACTCAGTTCTTGAGA CATCGTGGCG GATGTATGGA GCGAGACGAT TAGGACAATG GTCACCGACG ACGGTCACCG CTATTCAGCA CAGAATGGCC CAACCTGAGT7301AGACGATAGT TACCGGATAA GGCGCAGCGG TCGGGCTGAA CGGGGGGTTC GTGCACACAG CCCAGCTTGG AGCGAACGAC CTACACCGAA CTGAGATACCTCTGCTATCA ATGGCCTATT CCGCGTCGCC AGCCCGACTT GCCCCCCAAG CACGTGTGTC GGGTCGAACC TCGCTTGCTG GATGTGGCTT GACTCTATGG7401TACAGCGTGA GCTATGAGAA AGCGCCACGC TTCCCGAAGG GAGAAAGGCG GACAGGTATC CGGTAAGCGG CAGGGTCGGA ACAGGAGAGC GCACGAGGGAATGTCGCACT CGATACTCTT TCGCGGTGCG AAGGGCTTCC CTCTTTCCGC CTGTCCATAG GCCATTCGCC GTCCCAGCCT TGTCCTCTCG CGTGCTCCCT7501GCTTCCAGGG GGAAACGCCT 
GGTATCTTTA TAGTCCTGTC GGGTTTCGCC ACCTCTGACT TGAGCGTCGA TTTTTGTGAT GCTCGTCAGG GGGGCGGAGCCGAAGGTCCC CCTTTGCGGA CCATAGAAAT ATCAGGACAG CCCAAAGCGG TGGAGACTGA ACTCGCAGCT AAAAACACTA CGAGCAGTCC CCCCGCCTCG7601CTATGGAAAA ACGCCAGCAA CGCGGCCTTT TTACGGTTCC TGGCCTTTTG CTGGCCTTTT GCTCACATGT TCTTTCCTGC GTTATCCCCT GATTCTGTGGGATACCTTTT TGCGGTCGTT GCGCCGGAAA AATGCCAAGG ACCGGAAAAC GACCGGAAAA CGAGTGTACA AGAAAGGACG CAATAGGGGA CTAAGACACC7701ATAACCGTAT TACCGCCATG CATTAGTTAT TAATAGTAAT CAATTACGGG GTCATTAGTT CATAGCCCAT ATATGGAGTT CCGCGTTACA TAACTTACGGTATTGGCATA ATGGCGGTAC GTAATCAATA ATTATCATTA GTTAATGCCC CAGTAATCAA GTATCGGGTA TATACCTCAA GGCGCAATGT ATTGAATGCC7801TAAATGGCCC GCCTGGCTGA CCGCCCAACG ACCCCCGCCC ATTGACGTCA ATAATGACGT ATGTTCCCAT AGTAACGCCA ATAGGGACTT TCCATTGACGATTTACCGGG CGGACCGACT GGCGGGTTGC TGGGGGCGGG TAACTGCAGT TATTACTGCA TACAAGGGTA TCATTGCGGT TATCCCTGAA AGGTAACTGC7901TCAATGGGTG GAGTATTTAC GGTAAACTGC CCACTTGGCA GTACATCAAG TGTATCATAT GCCAAGTACG CCCCCTATTG ACGTCAATGA CGGTAAATGGAGTTACCCAC CTCATAAATG CCATTTGACG GGTGAACCGT CATGTAGTTC ACATAGTATA CGGTTCATGC GGGGGATAAC TGCAGTTACT GCCATTTACC8001CCCGCCTGGC ATTATGCCCA GTACATGACC TTATGGGACT TTCCTACTTG GCAGTACATC TACGTATTAG TCATCGCTAT TACCATGGTG ATGCGGTTTTGGGCGGACCG TAATACGGGT CATGTACTGG AATACCCTGA AAGGATGAAC CGTCATGTAG ATGCATAATC AGTAGCGATA ATGGTACCAC TACGCCAAAA8101GGCAGTACAT CAATGGGCGT GGATAGCGGT TTGACTCACG GGGATTTCCA AGTCTCCACC CCATTGACGT CAATGGGAGT TTGTTTTGGC ACCAAAATCACCGTCATGTA GTTACCCGCA CCTATCGCCA AACTGAGTGC CCCTAAAGGT TCAGAGGTGG GGTAACTGCA GTTACCCTCA AACAAAACCG TGGTTTTAGT8201ACGGGACTTT CCAAAATGTC GTAACAACTC CGCCCCATTG ACGCAAATGG GCGGTAGGCG TGTACGGTGG GAGGTCTATA TAAGCAGAGC TTGCCCTGAAA GGTTTTACAG CATTGTTGAG GCGGGGTAAC TGCGTTTACC CGCCATCCGC ACATGCCACC CTCCAGATAT ATTCGTCTCG A
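
The map above marks two regions as I-SceI, one on each strand. A minimal sketch of how such sites could be located programmatically in a row copied from the listing is given below. It assumes the commonly cited 18-bp I-SceI recognition sequence, which the listing itself does not spell out. The helper names (`clean`, `reverse_complement`, `find_sites`) are hypothetical, and the reported coordinates are relative to the pasted fragment, not to the plasmid numbering.

```python
# Assumption: the 18-bp I-SceI recognition sequence, written 5'->3'.
I_SCEI_SITE = "TAGGGATAACAGGGTAAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(row: str) -> str:
    """Keep only letters, so spaces and fused position numbers are dropped."""
    return "".join(ch for ch in row.upper() if ch.isalpha())


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous (A/C/G/T) sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(row: str, site: str = I_SCEI_SITE) -> list[tuple[int, str]]:
    """Return sorted (1-based start, strand) pairs for every match of `site`.

    "+" means the site reads 5'->3' on the given strand; "-" means the
    given strand carries the reverse complement of the site.
    """
    seq = clean(row)
    hits = set()
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        start = seq.find(pattern)
        while start != -1:
            hits.add((start + 1, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Top-strand fragments copied from the two annotated regions of the map.
    forward_fragment = "GTATCTTAAG TAGGGATAAC AGGGTAATTT TGTTAAATCA"
    reverse_fragment = "TCACTGCCAT TACCCTGTTA TCCCTAGTGA"
    print(find_sites(forward_fragment))
    print(find_sites(reverse_fragment))
```

Searching for both the site and its reverse complement on a single strand avoids having to parse the interleaved bottom-strand rows of the double-stranded map.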

APPENDIX 2 Sequences of cloned light chains.

[Alignment not reproduced: the sequence rows are missing from this text. Only the clone labels and the position index given in parentheses for each of the four alignment blocks survive. An asterisk marks a block in which the row was shown as a run of gaps.]

Clone    Block 1   Block 2   Block 3   Block 4
16       1         64        133       203
6        1         64        134       204
22       1         64        134       204
1        1         71        141       211
21       1         71        140       210
24       1*        1         24        94
33       1         66        135       205
33-35    1         66        135       205
41       1         66        135       205
7        1         66        135       205
7-7      1         66        135       205
41-40    1         66        135       205
8        1         65        134       204
4        1         64        132       179*
9        1         65        134       204
31       1         65        135       205
17       1         66        135       205

APPENDIX 3 Alignment of sequences of cloned variable domains of heavy chains

[Alignment not reproduced: the sequence rows are missing from this text. Only the clone labels and the position index given in parentheses for each of the four alignment blocks survive. An asterisk marks a block in which the row was shown as a run of gaps.]

Clone    Block 1   Block 2   Block 3   Block 4
14       1         69        137       207
15       1         69        136       206
1        1         71        128       198
21       1         69        135       148*
33       1         69        129       142*
41       1         69        130       143*
6        1         69        137       207
7        1         69        127       197
8        1         69        129       199
9        1         69        133       203
32       1         69        133       203
31       1         69        133       203

APPENDIX 4Sequences of plasmids encoding spAG-MLuc and spAG-ΔN-MLuc hybrids.pETspAG-ΔN-MLuc1    1GGAAAAATGC CTGGCAAAAA ACTGCCACTG GCAGTTATCA TGGAAATGGA AGCCAATGCT TTCAAAGCTG GCTGCACCAGCCTTTTTACG GACCGTTTTT TGACGGTGAC CGTCAATAGT ACCTTTACCT TCGGTTACGA AAGTTTCGAC CGACGTGGTCGGGATGCCTT ATCTGTCTTT CCCTACGGAA TAGACAGAAA  101CAAAAATTAA GTGTACAGCC AAAATGAAGG TATACATTCC AGGAAGGTGT CACGATTATG GTGGTGACAA GAAAACTGGAGTTTTTAATT CACATGTCGG TTTTACTTCC ATATGTAAGG TCCTTCCACA GTGCTAATAC CACCACTGTT CTTTTGACCTCAGGCAGGAA TTGTTGGTGC GTCCGTCCTT AACAACCACG  201AATTGTTGAC ATTCCCGAAA TCTCTGGATT TAAGGAGATG GCACCCATGG AACAGTTCAT TGCTCAAGTT GATCGCTGCGTTAACAACTG TAAGGGCTTT AGAGACCTAA ATTCCTCTAC CGTGGGTACC TTGTCAAGTA ACGAGTTCAA CTAGCGACGCCTTCCTGCAC TACTGGATGT GAAGGACGTG ATGACCTACA  301CTCAAAGGTC TTGCCAATGT TAAGTGCTCT GAACTCCTGA AGAAATGGCT GCCTGACAGG TGTGCAAGTT TTGCTGACAAGAGTTTCCAG AACGGTTACA ATTCACGAGA CTTGAGGACT TCTTTACCGA CGGACTGTCC ACACGTTCAA AACGACTGTTGATTCAAAAA GAAGTTCACA CTAAGTTTTT CTTCAAGTGT  401ATATCAAAGG CATGGCCGTA CAGCTGCAGG TCGAGCACCA CCACCACCAC CACTGAGATC CGGCTGCTAA CAAAGCCCGATATAGTTTCC GTACCGGCAT GTCGACGTCC AGCTCGTGGT GGTGGTGGTG GTGACTCTAG GCCGACGATT GTTTCGGGCTAAGGAAGCTG AGTTGGCTGC TTCCTTCGAC TCAACCGACG  501TGCCACCGCT GAGCAATAAC TAGCATAACC CCTTGGGGCC TCTAAACGGG TCTTGAGGGG TTTTTTGCTG AAAGGAGGAAACGGTGGCGA CTCGTTATTG ATCGTATTGG GGAACCCCGG AGATTTGCCC AGAACTCCCC AAAAAACGAC TTTCCTCCTTCTATATCCGG ATTGGCGAAT GATATAGGCC TAACCGCTTA  601GGGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG CGCAGCGTGA CCGCTACACT TGCCAGCGCCCCCTGCGCGG GACATCGCCG CGTAATTCGC GCCGCCCACA CCACCAATGC GCGTCGCACT GGCGATGTGA ACGGTCGCGGCTAGCGCCCG CTCCTTTCGC GATCGCGGGC GAGGAAAGCG  701TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCTTTCCC CGTCAAGCTC TAAATCGGGG GCTCCCTTTA GGGTTCCGATAAAGAAGGGA AGGAAAGAGC GGTGCAAGCG GCCGAAAGGG GCAGTTCGAG ATTTAGCCCC CGAGGGAAAT CCCAAGGCTATTAGTGCTTT ACGGCACCTC AATCACGAAA TGCCGTGGAG  801GACCCCAAAA AACTTGATTA GGGTGATGGT TCACGTAGTG GGCCATCGCC CTGATAGACG GTTTTTCGCC CTTTGACGTTCTGGGGTTTT TTGAACTAAT CCCACTACCA 
AGTGCATCAC CCGGTAGCGG GACTATCTGC CAAAAAGCGG GAAACTGCAAGGAGTCCACG TTCTTTAATA CCTCAGGTGC AAGAAATTAT  901GTGGACTCTT GTTCCAAACT GGAACAACAC TCAACCCTAT CTCGGTCTAT TCTTTTGATT TATAAGGGAT TTTGCCGATTCACCTGAGAA CAAGGTTTGA CCTTGTTGTG AGTTGGGATA GAGCCAGATA AGAAAACTAA ATATTCCCTA AAACGGCTAATCGGCCTATT GGTTAAAAAA AGCCGGATAA CCAATTTTTT 1001TGAGCTGATT TAACAAAAAT TTAACGCGAA TTTTAACAAA ATATTAACGT TTACAATTTC AGGTGGCACT TTTCGGGGAA ACTCGACTAA ATTGTTTTTA AATTGCGCTT AAAATTGTTT TATAATTGCA AATGTTAAAG TCCACCGTGA AAAGCCCCTT ATGTGCGCGG AACCCCTATT TACACGCGCC TTGGGGATAA 1101TGTTTATTTT TCTAAATACA TTCAAATATG TATCCGCTCA TGAATTAATT CTTAGAAAAA CTCATCGAGC ATCAAATGAA ACAAATAAAA AGATTTATGT AAGTTTATAC ATAGGCGAGT ACTTAATTAA GAATCTTTTT GAGTAGCTCG TAGTTTACTT ACTGCAATTT ATTCATATCA TGACGTTAAA TAAGTATAGT 1201GGATTATCAA TACCATATTT TTGAAAAAGC CGTTTCTGTA ATGAAGGAGA AAACTCACCG AGGCAGTTCC ATAGGATGGC CCTAATAGTT ATGGTATAAA AACTTTTTCG GCAAAGACAT TACTTCCTCT TTTGAGTGGC TCCGTCAAGG TATCCTACCG AAGATCCTGG TATCGGTCTG TTCTAGGACC ATAGCCAGAC 1301CGATTCCGAC TCGTCCAACA TCAATACAAC CTATTAATTT CCCCTCGTCA AAAATAAGGT TATCAAGTGA GAAATCACCA GCTAAGGCTG AGCAGGTTGT AGTTATGTTG GATAATTAAA GGGGAGCAGT TTTTATTCCA ATAGTTCACT CTTTAGTGGT TGAGTGACGA CTGAATCCGG ACTCACTGCT GACTTAGGCC 1401TGAGAATGGC AAAAGTTTAT GCATTTCTTT CCAGACTTGT TCAACAGGCC AGCCATTACG CTCGTCATCA AAATCACTCG ACTCTTACCG TTTTCAAATA CGTAAAGAAA GGTCTGAACA AGTTGTCCGG TCGGTAATGC GAGCAGTAGT TTTAGTGAGC CATCAACCAA ACCGTTATTC GTAGTTGGTT TGGCAATAAG 1501ATTCGTGATT GCGCCTGAGC GAGACGAAAT ACGCGATCGC TGTTAAAAGG ACAATTACAA ACAGGAATCG AATGCAACCG TAAGCACTAA CGCGGACTCG CTCTGCTTTA TGCGCTAGCG ACAATTTTCC TGTTAATGTT TGTCCTTAGC TTACGTTGGC GCGCAGGAAC ACTGCCAGCG CGCGTCCTTG TGACGGTCGC 1601CATCAACAAT ATTTTCACCT GAATCAGGAT ATTCTTCTAA TACCTGGAAT GCTGTTTTCC CGGGGATCGC AGTGGTGAGT GTAGTTGTTA TAAAAGTGGA CTTAGTCCTA TAAGAAGATT ATGGACCTTA CGACAAAAGG GCCCCTAGCG TCACCACTCA AACCATGCAT CATCAGGAGT TTGGTACGTA GTAGTCCTCA 1701ACGGATAAAA TGCTTGATGG TCGGAAGAGG CATAAATTCC GTCAGCCAGT TTAGTCTGAC CATCTCATCT GTAACATCAT TGCCTATTTT 
ACGAACTACC AGCCTTCTCC GTATTTAAGG CAGTCGGTCA AATCAGACTG GTAGAGTAGA CATTGTAGTA TGGCAACGCT ACCTTTGCCA ACCGTTGCGA TGGAAACGGT 1801TGTTTCAGAA ACAACTCTGG CGCATCGGGC TTCCCATACA ATCGATAGAT TGTCGCACCT GATTGCCCGA CATTATCGCG ACAAAGTCTT TGTTGAGACC GCGTAGCCCG AAGGGTATGT TAGCTATCTA ACAGCGTGGA CTAACGGGCT GTAATAGCGC AGCCCATTTA TACCCATATA TCGGGTAAAT ATGGGTATAT 1901AATCAGCATC CATGTTGGAA TTTAATCGCG GCCTAGAGCA AGACGTTTCC CGTTGAATAT GGCTCATAAC ACCCCTTGTA TTAGTCGTAG GTACAACCTT AAATTAGCGC CGGATCTCGT TCTGCAAAGG GCAACTTATA CCGAGTATTG TGGGGAACAT TTACTGTTTA TGTAAGCAGA AATGACAAAT ACATTCGTCT 2001CAGTTTTATT GTTCATGACC AAAATCCCTT AACGTGAGTT TTCGTTCCAC TGAGCGTCAG ACCCCGTAGA AAAGATCAAA GTCAAAATAA CAAGTACTGG TTTTAGGGAA TTGCACTCAA AAGCAAGGTG ACTCGCAGTC TGGGGCATCT TTTCTAGTTT GGATCTTCTT GAGATCCTTT CCTAGAAGAA CTCTAGGAAA 2101TTTTCTGCGC GTAATCTGCT GCTTGCAAAC AAAAAAACCA CCGCTACCAG CGGTGGTTTG TTTGCCGGAT CAAGAGCTAC AAAAGACGCG CATTAGACGA CGAACGTTTG TTTTTTTGGT GGCGATGGTC GCCACCAAAC AAACGGCCTA GTTCTCGATG CAACTCTTTT TCCGAAGGTA GTTGAGAAAA AGGCTTCCAT 2201ACTGGCTTCA GCAGAGCGCA GATACCAAAT ACTGTCCTTC TAGTGTAGCC GTAGTTAGGC CACCACTTCA AGAACTCTGT TGACCGAAGT CGTCTCGCGT CTATGGTTTA TGACAGGAAG ATCACATCGG CATCAATCCG GTGGTGAAGT TCTTGAGACA AGCACCGCCT ACATACCTCG TCGTGGCGGA TGTATGGAGC 2301CTCTGCTAAT CCTGTTACCA GTGGCTGCTG CCAGTGGCGA TAAGTCGTGT CTTACCGGGT TGGACTCAAG ACGATAGTTA GAGACGATTA GGACAATGGT CACCGACGAC GGTCACCGCT ATTCAGCACA GAATGGCCCA ACCTGAGTTC TGCTATCAAT CCGGATAAGG CGCAGCGGTC GGCCTATTCC GCGTCGCCAG 2401GGGCTGAACG GGGGGTTCGT GCACACAGCC CAGCTTGGAG CGAACGACCT ACACCGAACT GAGATACCTA CAGCGTGAGC CCCGACTTGC CCCCCAAGCA CGTGTGTCGG GTCGAACCTC GCTTGCTGGA TGTGGCTTGA CTCTATGGAT GTCGCACTCG TATGAGAAAG CGCCACGCTT ATACTCTTTC GCGGTGCGAA 2501CCCGAAGGGA GAAAGGCGGA CAGGTATCCG GTAAGCGGCA GGGTCGGAAC AGGAGAGCGC ACGAGGGAGC TTCCAGGGGG GGGCTTCCCT CTTTCCGCCT GTCCATAGGC CATTCGCCGT CCCAGCCTTG TCCTCTCGCG TGCTCCCTCG AAGGTCCCCC AAACGCCTGG TATCTTTATA TTTGCGGACC ATAGAAATAT 2601GTCCTGTCGG GTTTCGCCAC CTCTGACTTG AGCGTCGATT TTTGTGATGC TCGTCAGGGG GGCGGAGCCT 
ATGGAAAAAC CAGGACAGCC CAAAGCGGTG GAGACTGAAC TCGCAGCTAA AAACACTACG AGCAGTCCCC CCGCCTCGGA TACCTTTTTG GCCAGCAACG CGGCCTTTTT CGGTCGTTGC GCCGGAAAAA 2701ACGGTTCCTG GCCTTTTGCT GGCCTTTTGC TCACATGTTC TTTCCTGCGT TATCCCCTGA TTCTGTGGAT AACCGTATTA TGCCAAGGAC CGGAAAACGA CCGGAAAACG AGTGTACAAG AAAGGACGCA ATAGGGGACT AAGACACCTA TTGGCATAAT CCGCCTTTGA GTGAGCTGAT GGCGGAAACT CACTCGACTA 2801ACCGCTCGCC GCAGCCGAAC GACCGAGCGC AGCGAGTCAG TGAGCGAGGA AGCGGAAGAG CGCCTGATGC GGTATTTTCT TGGCGAGCGG CGTCGGCTTG CTGGCTCGCG TCGCTCAGTC ACTCGCTCCT TCGCCTTCTC GCGGACTACG CCATAAAAGA CCTTACGCAT CTGTGCGGTA GGAATGCGTA GACACGCCAT 2901TTTCACACCG CATATATGGT GCACTCTCAG TACAATCTGC TCTGATGCCG CATAGTTAAG CCAGTATACA CTCCGCTATC AAAGTGTGGC GTATATACCA CGTGAGAGTC ATGTTAGACG AGACTACGGC GTATCAATTC GGTCATATGT GAGGCGATAG GCTACGTGAC TGGGTCATGG CGATGCACTG ACCCAGTACC 3001CTGCGCCCCG ACACCCGCCA ACACCCGCTG ACGCGCCCTG ACGGGCTTGT CTGCTCCCGG CATCCGCTTA CAGACAAGCT GACGCGGGGC TGTGGGCGGT TGTGGGCGAC TGCGCGGGAC TGCCCGAACA GACGAGGGCC GTAGGCGAAT GTCTGTTCGA GTGACCGTCT CCGGGAGCTG CACTGGCAGA GGCCCTCGAC 3101CATGTGTCAG AGGTTTTCAC CGTCATCACC GAAACGCGCG AGGCAGCTGC GGTAAAGCTC ATCAGCGTGG TCGTGAAGCG GTACACAGTC TCCAAAAGTG GCAGTAGTGG CTTTGCGCGC TCCGTCGACG CCATTTCGAG TAGTCGCACC AGCACTTCGC ATTCACAGAT GTCTGCCTGT TAAGTGTCTA CAGACGGACA 3201TCATCCGCGT CCAGCTCGTT GAGTTTCTCC AGAAGCGTTA ATGTCTGGCT TCTGATAAAG CGGGCCATGT TAAGGGCGGT AGTAGGCGCA GGTCGAGCAA CTCAAAGAGG TCTTCGCAAT TACAGACCGA AGACTATTTC GCCCGGTACA ATTCCCGCCA TTTTTCCTGT TTGGTCACTG AAAAAGGACA AACCAGTGAC 3301ATGCCTCCGT GTAAGGGGGA TTTCTGTTCA TGGGGGTAAT GATACCGATG AAACGAGAGA GGATGCTCAC GATACGGGTT TACGGAGGCA CATTCCCCCT AAAGACAAGT ACCCCCATTA CTATGGCTAC TTTGCTCTCT CCTACGAGTG CTATGCCCAA ACTGATGATG AACATGCCCG TGACTACTAC TTGTACGGGC 3401GTTACTGGAA CGTTGTGAGG GTAAACAACT GGCGGTATGG ATGCGGCGGG ACCAGAGAAA AATCACTCAG GGTCAATGCC CAATGACCTT GCAACACTCC CATTTGTTGA CCGCCATACC TACGCCGCCC TGGTCTCTTT TTAGTGAGTC CCAGTTACGG AGCGCTTCGT TAATACAGAT TCGCGAAGCA ATTATGTCTA 3501GTAGGTGTTC CACAGGGTAG CCAGCAGCAT CCTGCGATGC AGATCCGGAA 
CATAATGGTG CAGGGCGCTG ACTTCCGCGT CATCCACAAG GTGTCCCATC GGTCGTCGTA GGACGCTACG TCTAGGCCTT GTATTACCAC GTCCCGCGAC TGAAGGCGCA TTCCAGACTT TACGAAACAC AAGGTCTGAA ATGCTTTGTG 3601GGAAACCGAA GACCATTCAT GTTGTTGCTC AGGTCGCAGA CGTTTTGCAG CAGCAGTCGC TTCACGTTCG CTCGCGTATC CCTTTGGCTT CTGGTAAGTA CAACAACGAG TCCAGCGTCT GCAAAACGTC GTCGTCAGCG AAGTGCAAGC GAGCGCATAG GGTGATTCAT TCTGCTAACC CCACTAAGTA AGACGATTGG 3701AGTAAGGCAA CCCCGCCAGC CTAGCCGGGT CCTCAACGAC AGGAGCACGA TCATGCGCAC CCGTGGGGCC GCCATGCCGG TCATTCCGTT GGGGCGGTCG GATCGGCCCA GGAGTTGCTG TCCTCGTGCT AGTACGCGTG GGCACCCCGG CGGTACGGCC CGATAATGGC CTGCTTCTCG GCTATTACCG GACGAAGAGC 3801CCGAAACGTT TGGTGGCGGG ACCAGTGACG AAGGCTTGAG CGAGGGCGTG CAAGATTCCG AATACCGCAA GCGACAGGCC GGCTTTGCAA ACCACCGCCC TGGTCACTGC TTCCGAACTC GCTCCCGCAC GTTCTAAGGC TTATGGCGTT CGCTGTCCGG GATCATCGTC GCGCTCCAGC CTAGTAGCAG CGCGAGGTCG 3901GAAAGCGGTC CTCGCCGAAA ATGACCCAGA GCGCTGCCGG CACCTGTCCT ACGAGTTGCA TGATAAAGAA GACAGTCATA CTTTCGCCAG GAGCGGCTTT TACTGGGTCT CGCGACGGCC GTGGACAGGA TGCTCAACGT ACTATTTCTT CTGTCAGTAT AGTGCGGCGA CGATAGTCAT TCACGCCGCT GCTATCAGTA 4001GCCCCGCGCC CACCGGAAGG AGCTGACTGG GTTGAAGGCT CTCAAGGGCA TCGGTCGAGA TCCCGGTGCC TAATGAGTGA CGGGGCGCGG GTGGCCTTCC TCGACTGACC CAACTTCCGA GAGTTCCCGT AGCCAGCTCT AGGGCCACGG ATTACTCACT GCTAACTTAC ATTAATTGCG CGATTGAATG TAATTAACGC 4101TTGCGCTCAC TGCCCGCTTT CCAGTCGGGA AACCTGTCGT GCCAGCTGCA TTAATGAATC GGCCAACGCG CGGGGAGAGG AACGCGAGTG ACGGGCGAAA GGTCAGCCCT TTGGACAGCA CGGTCGACGT AATTACTTAG CCGGTTGCGC GCCCCTCTCC CGGTTTGCGT ATTGGGCGCC GCCAAACGCA TAACCCGCGG 4201AGGGTGGTTT TTCTTTTCAC CAGTGAGACG GGCAACAGCT GATTGCCCTT CACCGCCTGG CCCTGAGAGA GTTGCAGCAA TCCCACCAAA AAGAAAAGTG GTCACTCTGC CCGTTGTCGA CTAACGGGAA GTGGCGGACC GGGACTCTCT CAACGTCGTT GCGGTCCACG CTGGTTTGCC CGCCAGGTGC GACCAAACGG 4301CCAGCAGGCG AAAATCCTGT TTGATGGTGG TTAACGGCGG GATATAACAT GAGCTGTCTT CGGTATCGTC GTATCCCACT GGTCGTCCGC TTTTAGGACA AACTACCACC AATTGCCGCC CTATATTGTA CTCGACAGAA GCCATAGCAG CATAGGGTGA ACCGAGATAT CCGCACCAAC TGGCTCTATA GGCGTGGTTG 4401GCGCAGCCCG GACTCGGTAA TGGCGCGCAT 
TGCGCCCAGC GCCATCTGAT CGTTGGCAAC CAGCATCGCA GTGGGAACGA CGCGTCGGGC CTGAGCCATT ACCGCGCGTA ACGCGGGTCG CGGTAGACTA GCAACCGTTG GTCGTAGCGT CACCCTTGCT TGCCCTCATT CAGCATTTGC ACGGGAGTAA GTCGTAAACG 4501ATGGTTTGTT GAAAACCGGA CATGGCACTC CAGTCGCCTT CCCGTTCCGC TATCGGCTGA ATTTGATTGC GAGTGAGATA TACCAAACAA CTTTTGGCCT GTACCGTGAG GTCAGCGGAA GGGCAAGGCG ATAGCCGACT TAAACTAACG CTCACTCTAT TTTATGCCAG CCAGCCAGAC AAATACGGTC GGTCGGTCTG 4601GCAGACGCGC CGAGACAGAA CTTAATGGGC CCGCTAACAG CGCGATTTGC TGGTGACCCA ATGCGACCAG ATGCTCCACG CGTCTGCGCG GCTCTGTCTT GAATTACCCG GGCGATTGTC GCGCTAAACG ACCACTGGGT TACGCTGGTC TACGAGGTGC CCCAGTCGCG TACCGTCTTC GGGTCAGCGC ATGGCAGAAG 4701ATGGGAGAAA ATAATACTGT TGATGGGTGT CTGGTCAGAG ACATCAAGAA ATAACGCCGG AACATTAGTG CAGGCAGCTT TACCCTCTTT TATTATGACA ACTACCCACA GACCAGTCTC TGTAGTTCTT TATTGCGGCC TTGTAATCAC GTCCGTCGAA CCACAGCAAT GGCATCCTGG GGTGTCGTTA CCGTAGGACC 4801TCATCCAGCG GATAGTTAAT GATCAGCCCA CTGACGCGTT GCGCGAGAAG ATTGTGCACC GCCGCTTTAC AGGCTTCGAC AGTAGGTCGC CTATCAATTA CTAGTCGGGT GACTGCGCAA CGCGCTCTTC TAACACGTGG CGGCGAAATG TCCGAAGCTG GCCGCTTCGT TCTACCATCG CGGCGAAGCA AGATGGTAGC 4901ACACCACCAC GCTGGCACCC AGTTGATCGG CGCGAGATTT AATCGCCGCG ACAATTTGCG ACGGCGCGTG CAGGGCCAGA TGTGGTGGTG CGACCGTGGG TCAACTAGCC GCGCTCTAAA TTAGCGGCGC TGTTAAACGC TGCCGCGCAC GTCCCGGTCT CTGGAGGTGG CAACGCCAAT GACCTCCACC GTTGCGGTTA 5001CAGCAACGAC TGTTTGCCCG CCAGTTGTTG TGCCACGCGG TTGGGAATGT AATTCAGCTC CGCCATCGCC GCTTCCACTT GTCGTTGCTG ACAAACGGGC GGTCAACAAC ACGGTGCGCC AACCCTTACA TTAAGTCGAG GCGGTAGCGG CGAAGGTGAA TTTCCCGCGT TTTCGCAGAA AAAGGGCGCA AAAGCGTCTT 5101ACGTGGCTGG CCTGGTTCAC CACGCGGGAA ACGGTCTGAT AAGAGACACC GGCATACTCT GCGACATCGT ATAACGTTAC TGCACCGACC GGACCAAGTG GTGCGCCCTT TGCCAGACTA TTCTCTGTGG CCGTATGAGA CGCTGTAGCA TATTGCAATG TGGTTTCACA TTCACCACCC ACCAAAGTGT AAGTGGTGGG 5201TGAATTGACT CTCTTCCGGG CGCTATCATG CCATACCGCG AAAGGTTTTG CGCCATTCGA TGGTGTCCGG GATCTCGACG ACTTAACTGA GAGAAGGCCC GCGATAGTAC GGTATGGCGC TTTCCAAAAC GCGGTAAGCT ACCACAGGCC CTAGAGCTGC CTCTCCCTTA TGCGACTCCT GAGAGGGAAT ACGCTGAGGA 5301GCATTAGGAA 
GCAGCCCAGT AGTAGGTTGA GGCCGTTGAG CACCGCCGCC GCAAGGAATG GTGCATGCAA GGAGATGGCG CGTAATCCTT CGTCGGGTCA TCATCCAACT CCGGCAACTC GTGGCGGCGG CGTTCCTTAC CACGTACGTT CCTCTACCGC CCCAACAGTC CCCCGGCCAC GGGTTGTCAG GGGGCCGGTG 5401GGGGCCTGCC ACCATACCCA CGCCGAAACA AGCGCTCATG AGCCCGAAGT GGCGAGCCCG ATCTTCCCCA TCGGTGATGT CCCCGGACGG TGGTATGGGT GCGGCTTTGT TCGCGAGTAC TCGGGCTTCA CCGCTCGGGC TAGAAGGGGT AGCCACTACA CGGCGATATA GGCGCCAGCA GCCGCTATAT CCGCGGTCGT 5501ACCGCACCTG TGGCGCCGGT GATGCCGGCC ACGATGCGTC CGGCGTAGAG GATCGAGATC TCGATCCCGC GAAATTAATA TGGCGTGGAC ACCGCGGCCA CTACGGCCGG TGCTACGCAG GCCGCATCTC CTAGCTCTAG AGCTAGGGCG CTTTAATTAT CGACTCACTA TAGGGGAATT GCTGAGTGAT ATCCCCTTAA 5601GTGAGCGGAT AACAATTCCC CTCTAGAAAT AATTTTGTTT AACTTTAAGA AGGAGATATA CCATGGGCAG CAGCCATCAT CACTCGCCTA TTGTTAAGGG GAGATCTTTA TTAAAACAAA TTGAAATTCT TCCTCTATAT GGTACCCGTC GTCGGTAGTA CATCATCATC ACAGCAGCGG GTAGTAGTAG TGTCGTCGCC 5701CCTGGTGCCG CGCGGCAGCC ATAGGTCGAC TCTAGAGGAT CCAAGCCAAA GCACTAACGT TTTAGGTGAA GCTAAAAAAT GGACCACGGC GCGCCGTCGG TATCCAGCTG AGATCTCCTA GGTTCGGTTT CGTGATTGCA AAATCCACTT CGATTTTTTA TAAACGAATC TCAAGCACCG ATTTGCTTAG AGTTCGTGGC 5801AAAGCTGACA ACAATTTCAA CAAAGAACAA CAAAATGCTT TCTATGAAAT CTTGAACATG CCTAACTTGA ACGAAGAACA TTTCGACTGT TGTTAAAGTT GTTTCTTGTT GTTTTACGAA AGATACTTTA GAACTTGTAC GGATTGAACT TGCTTCTTGT ACGCAATGGT TTCATCCAAA TGCGTTACCA AAGTAGGTTT 5901 GCTTAAAAGA TGACCCAAGT CAAAGTGCTA ACCTTTTAGC AGAAGCTAAA AAGTTAAATG AATCTCAAGC ACCGAAAGCT CGAATTTTCT ACTGGGTTCA GTTTCACGAT TGGAAAATCG TCTTCGATTT TTCAATTTAC TTAGAGTTCG TGGCTTTCGA GATAACAAAT TCAACAAAGA CTATTGTTTA AGTTGTTTCT 6001ACAACAAAAT GCTTTCTATG AAATCTTACA TTTACCTAAC TTAAATGAAG AACAACGCAA TGGTTTCATC CAAAGCTTAA TGTTGTTTTA CGAAAGATAC TTTAGAATGT AAATGGATTG AATTTACTTC TTGTTGCGTT ACCAAAGTAG GTTTCGAATT AAGATGACCC AAGCCAAAGC TTCTACTGGG TTCGGTTTCG 6101GCTAACCTTT TAGCAGAAGC TAAAAAGCTA AATGATGCAC AAGCACCAAA AGCTGACAAC AAATTCAACA AAGAACAACA CGATTGGAAA ATCGTCTTCG ATTTTTCGAT TTACTACGTG TTCGTGGTTT TCGACTGTTG TTTAAGTTGT TTCTTGTTGT AAATGCTTTC TATGAAATTT TTTACGAAAG 
ATACTTTAAA 6201TACATTTACC TAACTTAACT GAAGAACAAC GTAACGGCTT CATCCAAAGC CTTAAAGACG ATCCCCGGTC GACTCTAGCG ATGTAAATGG ATTGAATTGA CTTCTTGTTG CATTGCCGAA GTAGGTTTCG GAATTTCTGC TAGGGGCCAG CTGAGATCGC GCAGCTTCCG GTGCTAGCAC CGTCGAAGGC CACGATCGTG 6301TGACACTTAC AAATTAATCC TTAATGGTAA AACATTGAAA GGCGAAACAA CTACTGAAGC TGTTGATGCT GCTACTGCAG ACTGTGAATG TTTAATTAGG AATTACCATT TTGTAACTTT CCGCTTTGTT GATGACTTCG ACAACTACGA CGATGACGTC AAAAAGTCTT CAAACAATAC TTTTTCAGAA GTTTGTTATG 6401GCTAACGACA ACGGTGTTGA CGGTGAATGG ACTTACGACG ATGCGACTAA GACCTTTACA GTTACTGAAA AACCAGAAGT CGATTGCTGT TGCCACAACT GCCACTTACC TGAATGCTGC TACGCTGATT CTGGAAATGT CAATGACTTT TTGGTCTTCA GATCGATGCG TCTGAATTAA CTAGCTACGC AGACTTAATT 6501CACCAGCCGT GACAACTTAC AAACTTGTTA TTAATGGTAA AACATTGAAA GGCGAAACAA CTACTAAAGC AGTAGACGCA GTGGTCGGCA CTGTTGAATG TTTGAACAAT AATTACCATT TTGTAACTTT CCGCTTTGTT GATGATTTCG TCATCTGCGT GAAACTGCAG AAAAAGCCTT CTTTGACGTC TTTTTCGGAA 6601CAAACAATAC GCTAACGACA ACGGTGTTGA TGGTGTTTGG ACTTATGATG ATGCGACTAA GACCTTTACG GTAACTGAAA GTTTGTTATG CGATTGCTGT TGCCACAACT ACCACAAACC TGAATACTAC TACGCTGATT CTGGAAATGC CATTGACTTT TGGTTACAGA GGTACCAGAT ACCAATGTCT CCATGGTCTA 6701CTTAGCAACT TTGTTGCAAC TGAAACCGAT GCTAACCGCGAATCGTTGA AACAACGTTG ACTTTGGCTA CGATTGGCG pS14L-spAG-MLuc16    1AGCGCCCAAT ACGCAAACCG CCTCTCCCCG CGCGTTGGCC GATTCATTAA TGCAGCTGGC ACGACAGGTT TCCCGACTGG TCGCGGGTTA TGCGTTTGGC GGAGAGGGGC GCGCAACCGG CTAAGTAATT ACGTCGACCG TGCTGTCCAA AGGGCTGACC AAAGCGGGCA GTGAGCGCAA TTTCGCCCGT CACTCGCGTT  101CGCAATTAAT GTGAGTTAGC TCACTCATTA GGCACCCCAG GCTTTACACT TTATGCTTCC GGCTCGTATG TTGTGTGGAA GCGTTAATTA CACTCAATCG AGTGAGTAAT CCGTGGGGTC CGAAATGTGA AATACGAAGG CCGAGCATAC AACACACCTT TTGTGAGCGG ATAACAATTT AACACTCGCC TATTGTTAAA  201CACACAGGAA ACAGCTATGA CCATGATTAC GCCAAGCTTT AGGGATAACA GGGTAATCGC CATGCATTAG TTATTAATAG GTGTGTCCTT TGTCGATACT GGTACTAATG CGGTTCGAAA TCCCTATTGT CCCATTAGCG GTACGTAATC AATAATTATC TAATCAATTA CGGGGTCATT ATTAGTTAAT GCCCCAGTAA  301AGTTCATAGC CCATATATGG AGTTCCGCGT TACATAACTT ACGGTAAATG GCCCGCCTGG CTGACCGCCC 
AACGACCCCC TCAAGTATCG GGTATATACC TCAAGGCGCA ATGTATTGAA TGCCATTTAC CGGGCGGACC GACTGGCGGG TTGCTGGGGG GCCCATTGAC GTCAATAATG CGGGTAACTG CAGTTATTAC  401ACGTATGTTC CCATAGTAAC GCCAATAGGG ACTTTCCATT GACGTCAATG GGTGGAGTAT TTACGGTAAA CTGCCCACTT TGCATACAAG GGTATCATTG CGGTTATCCC TGAAAGGTAA CTGCAGTTAC CCACCTCATA AATGCCATTT GACGGGTGAA GGCAGTACAT CAAGTGTATC CCGTCATGTA GTTCACATAG  501ATATGCCAAG TACGCCCCCT ATTGACGTCA ATGACGGTAA ATGGCCCGCC TGGCATTATG CCCAGTACAT GACCTTATGG TATACGGTTC ATGCGGGGGA TAACTGCAGT TACTGCCATT TACCGGGCGG ACCGTAATAC GGGTCATGTA CTGGAATACC GACTTTCCTA CTTGGCAGTA CTGAAAGGAT GAACCGTCAT  601CATCTACGTA TTAGTCATCG CTATTACCAT GGTGATGCGG TTTTGGCAGT ACATCAATGG GCGTGGATAG CGGTTTGACT GTAGATGCAT AATCAGTAGC GATAATGGTA CCACTACGCC AAAACCGTCA TGTAGTTACC CGCACCTATC GCCAAACTGA CACGGGGATT TCCAAGTCTC GTGCCCCTAA AGGTTCAGAG  701CACCCCATTG ACGTCAATGG GAGTTTGTTT TGGCACCAAA ATCAACGGGA CTTTCCAAAA TGTCGTAACA ACTCCGCCCC GTGGGGTAAC TGCAGTTACC CTCAAACAAA ACCGTGGTTT TAGTTGCCCT GAAAGGTTTT ACAGCATTGT TGAGGCGGGG ATTGACGCAA ATGGGCGGTA TAACTGCGTT TACCCGCCAT  801GGCGTGTACG GTGGGAGGTC TATATAAGCA GAGCTGGTTT AGTGAACCGT CAGATCCGCT AGACGTCTCA TTTAGGCATG CCGCACATGC CACCCTCCAG ATATATTCGT CTCGACCAAA TCACTTGGCA GTCTAGGCGA TCTGCAGAGT AAATCCGTAC GAAACCCCAG CGCAGCTTCT CTTTGGGGTC GCGTCGAAGA  901CTTCCTCCTG CTACTCTGGA TCCCAGACAC CATTGAAGAA ATAGTGATGA CGCAGTCTCC AGCCACCCTG TCTGTGTCTC GAAGGAGGAC GATGAGACCT AGGGTCTGTG GTAACTTCTT TATCACTACT GCGTCAGAGG TCGGTGGGAC AGACACAGAG CAGGGGAAAG AGTCACCCTC GTCCCCTTTC TCAGTGGGAG 1001TCCAGCAGCC ATCATCATCA TCATCACAGC AGCGGCCTGG TGCCGCGCGG CAGCCATAGG TCGACTCTAG AGGATCCAAG AGGTCGTCGG TAGTAGTAGT AGTAGTGTCG TCGCCGGACC ACGGCGCGCC GTCGGTATCC AGCTGAGATC TCCTAGGTTC CCAAAGCACT AACGTTTTAG GGTTTCGTGA TTGCAAAATC 1101GTGAAGCTAA AAAATTAAAC GAATCTCAAG CACCGAAAGC TGACAACAAT TTCAACAAAG AACAACAAAA TGCTTTCTAT CACTTCGATT TTTTAATTTG CTTAGAGTTC GTGGCTTTCG ACTGTTGTTA AAGTTGTTTC TTGTTGTTTT ACGAAAGATA GAAATCTTGA ACATGCCTAA CTTTAGAACT TGTACGGATT 1201CTTGAACGAA GAACAACGCA ATGGTTTCAT CCAAAGCTTA AAAGATGACC 
CAAGTCAAAG TGCTAACCTT TTAGCAGAAG GAACTTGCTT CTTGTTGCGT TACCAAAGTA GGTTTCGAAT TTTCTACTGG GTTCAGTTTC ACGATTGGAA AATCGTCTTC CTAAAAAGTT AAATGAATCT GATTTTTCAA TTTACTTAGA 1301CAAGCACCGA AAGCTGATAA CAAATTCAAC AAAGAACAAC AAAATGCTTT CTATGAAATC TTACATTTAC CTAACTTAAA GTTCGTGGCT TTCGACTATT GTTTAAGTTG TTTCTTGTTG TTTTACGAAA GATACTTTAG AATGTAAATG GATTGAATTT TGAAGAACAA CGCAATGGTT ACTTCTTGTT GCGTTACCAA 1401TCATCCAAAG CTTAAAAGAT GACCCAAGCC AAAGCGCTAA CCTTTTAGCA GAAGCTAAAA AGCTAAATGA TGCACAAGCA AGTAGGTTTC GAATTTTCTA CTGGGTTCGG TTTCGCGATT GGAAAATCGT CTTCGATTTT TCGATTTACT ACGTGTTCGT CCAAAAGCTG ACAACAAATT GGTTTTCGAC TGTTGTTTAA 1501CAACAAAGAA CAACAAAATG CTTTCTATGA AATTTTACAT TTACCTAACT TAACTGAAGA ACAACGTAAC GGCTTCATCC GTTGTTTCTT GTTGTTTTAC GAAAGATACT TTAAAATGTA AATGGATTGA ATTGACTTCT TGTTGCATTG CCGAAGTAGG AAAGCCTTAA AGACGATCCC TTTCGGAATT TCTGCTAGGG 1601CGGTCGACTC TAGCGGCAGC TTCCGGTGCT AGCACTGACA CTTACAAATT AATCCTTAAT GGTAAAACAT TGAAAGGCGA GCCAGCTGAG ATCGCCGTCG AAGGCCACGA TCGTGACTGT GAATGTTTAA TTAGGAATTA CCATTTTGTA ACTTTCCGCT AACAACTACT GAAGCTGTTG TTGTTGATGA CTTCGACAAC 1701ATGCTGCTAC TGCAGAAAAA GTCTTCAAAC AATACGCTAA CGACAACGGT GTTGACGGTG AATGGACTTA CGACGATGCG TACGACGATG ACGTCTTTTT CAGAAGTTTG TTATGCGATT GCTGTTGCCA CAACTGCCAC TTACCTGAAT GCTGCTACGC ACTAAGACCT TTACAGTTAC TGATTCTGGA AATGTCAATG 1801TGAAAAACCA GAAGTGATCG ATGCGTCTGA ATTAACACCA GCCGTGACAA CTTACAAACT TGTTATTAAT GGTAAAACAT ACTTTTTGGT CTTCACTAGC TACGCAGACT TAATTGTGGT CGGCACTGTT GAATGTTTGA ACAATAATTA CCATTTTGTA TGAAAGGCGA AACAACTACT ACTTTCCGCT TTGTTGATGA 1901AAAGCAGTAG ACGCAGAAAC TGCAGAAAAA GCCTTCAAAC AATACGCTAA CGACAACGGT GTTGATGGTG TTTGGACTTA TTTCGTCATC TGCGTCTTTG ACGTCTTTTT CGGAAGTTTG TTATGCGATT GCTGTTGCCA CAACTACCAC AAACCTGAAT TGATGATGCG ACTAAGACCT ACTACTACGC TGATTCTGGA 2001TTACGGTAAC TGAAATGGTT ACAGAGGTAC CGCGGGCCCG GGATCCACCG GCTAGCGGGA ATTCCAAATC AACTGAGTTC AATGCCATTG ACTTTACCAA TGTCTCCATG GCGCCCGGGC CCTAGGTGGC CGATCGCCCT TAAGGTTTAG TTGACTCAAG GATCCTAACA TTGACATTGT CTAGGATTGT AACTGTAACA 2101TGGTTTAGAA GGAAAATTTG GTATTACAAA 
CCTAGAGACG GATTTATTCA CAATCTGGGA GACAATGGAG GTCATGATCA ACCAAATCTT CCTTTTAAAC CATAATGTTT GGATCTCTGC CTAAATAAGT GTTAGACCCT CTGTTACCTC CAGTACTAGT AAGCAGATAT TGCAGATACT TTCGTCTATA ACGTCTATGA 2201GATAGAGCCA GCAACTTTGT TGCAACTGAA ACCGATGCTA ACCGCGGAAA AATGCCTGGC AAAAAACTGC CACTGGCAGT CTATCTCGGT CGTTGAAACA ACGTTGACTT TGGCTACGAT TGGCGCCTTT TTACGGACCG TTTTTTGACG GTGACCGTCA TATCATGGAA ATGGAAGCCA ATAGTACCTT TACCTTCGGT 2301ATGCTTTCAA AGCTGGCTGC ACCAGGGGAT GCCTTATCTG TCTTTCAAAA ATTAAGTGTA CAGCCAAAAT GAAGGTATAC TACGAAAGTT TCGACCGACG TGGTCCCCTA CGGAATAGAC AGAAAGTTTT TAATTCACAT GTCGGTTTTA CTTCCATATG ATTCCAGGAA GGTGTCACGA TAAGGTCCTT CCACAGTGCT 2401TTATGGTGGT GACAAGAAAA CTGGACAGGC AGGAATTGTT GGTGCAATTG TTGACATTCC CGAAATCTCT GGATTTAAGG AATACCACCA CTGTTCTTTT GACCTGTCCG TCCTTAACAA CCACGTTAAC AACTGTAAGG GCTTTAGAGA CCTAAATTCC AGATGGCACC CATGGAACAG TCTACCGTGG GTACCTTGTC 2501TTCATTGCTC AAGTTGATCG CTGCGCTTCC TGCACTACTG GATGTCTCAA AGGTCTTGCC AATGTTAAGT GCTCTGAACT AAGTAACGAG TTCAACTAGC GACGCGAAGG ACGTGATGAC CTACAGAGTT TCCAGAACGG TTACAATTCA CGAGACTTGA CCTGAAGAAA TGGCTGCCTG GGACTTCTTT ACCGACGGAC 2601ACAGGTGTGC AAGTTTTGCT GACAAGATTC AAAAAGAAGT TCACAATATC AAAGGCATGG CCGGCGATCG ATGAGCGGCC TGTCCACACG TTCAAAACGA CTGTTCTAAG TTTTTCTTCA AGTGTTATAG TTTCCGTACC GGCCGCTAGC TACTCGCCGG GCAATTTAAT TCCGGTTATT CGTTAAATTA AGGCCAATAA 2701TTCCACCATA TTGCCGTCTT TTGGCAATGT GAGGGCCCGG AAACCTGGCC CTGTCTTCTT GACGAGCATT CCTAGGGGTC AAGGTGGTAT AACGGCAGAA AACCGTTACA CTCCCGGGCC TTTGGACCGG GACAGAAGAA CTGCTCGTAA GGATCCCCAG TTTCCCCTCT CGCCAAAGGA AAAGGGGAGA GCGGTTTCCT 2801ATGCAAGGTC TGTTGAATGT CGTGAAGGAA GCAGTTCCTC TGGAAGCTTC TTGAAGACAA ACAACGTCTG TAGCGACCCT TACGTTCCAG ACAACTTACA GCACTTCCTT CGTCAAGGAG ACCTTCGAAG AACTTCTGTT TGTTGCAGAC ATCGCTGGGA TTGCAGGCAG CGGAACCCCC AACGTCCGTC GCCTTGGGGG 2901CACCTGGCGA CAGGTGCCTC TGCGGCCAAA AGCCACGTGT ATAAGATACA CCTGCAAAGG CGGCACAACC CCAGTGCCAC GTGGACCGCT GTCCACGGAG ACGCCGGTTT TCGGTGCACA TATTCTATGT GGACGTTTCC GCCGTGTTGG GGTCACGGTG GTTGTGAGTT GGATAGTTGT CAACACTCAA CCTATCAACA 3001GGAAAGAGTC 
AAATGGCTCA CCTCAAGCGT ATTCAACAAG GGGCTGAAGG ATGCCCAGAA GGTACCCCAT TGTATGGGAT CCTTTCTCAG TTTACCGAGT GGAGTTCGCA TAAGTTGTTC CCCGACTTCC TACGGGTCTT CCATGGGGTA ACATACCCTA CTGATCTGGG GCCTCGGTGC GACTAGACCC CGGAGCCACG 3101ACATGCTTTA CATGTGTTTA GTCGAGGTTA AAAAACGTCT AGGCCCCCCG AACCACGGGG ACGTGGTTTT CCTTTGAAAA TGTACGAAAT GTACACAAAT CAGCTCCAAT TTTTTGCAGA TCCGGGGGGC TTGGTGCCCC TGCACCAAAA GGAAACTTTT ACACGATGAT AATATGGCCA TGTGCTACTA TTATACCGGT 3201CCACCCATAC CTAGGCTTTT GCAAAGATCG ATCAGATCCC GGGGGGCAAT GAGATATGAA AAAGCCTGAA CTCACCGCGA GGTGGGTATG GATCCGAAAA CGTTTCTAGC TAGTCTAGGG CCCCCCGTTA CTCTATACTT TTTCGGACTT GAGTGGCGCT CGTCTGTCGA GAAGTTTCTG GCAGACAGCT CTTCAAAGAC 3301ATCGAAAAGT TCGACAGCGT CTCCGACCTG ATGCAGCTCT CGGAGGGCGA AGAATCTCGT GCTTTCAGCT TCGATGTAGG TAGCTTTTCA AGCTGTCGCA GAGGCTGGAC TACGTCGAGA GCCTCCCGCT TCTTAGAGCA CGAAAGTCGA AGCTACATCC AGGGCGTGGA TATGTCCTGC TCCCGCACCT ATACAGGACG 3401GGGTAAATAG CTGCGCCGAT GGTTTCTACA AAGATCGTTA TGTTTATCGG CACTTTGCAT CGGCCGCGCT CCCGATTCCG CCCATTTATC GACGCGGCTA CCAAAGATGT TTCTAGCAAT ACAAATAGCC GTGAAACGTA GCCGGCGCGA GGGCTAAGGC GAAGTGCTTG ACATTGGGGA CTTCACGAAC TGTAACCCCT 3501ATTCAGCGAG AGCCTGACCT ATTGCATCTC CCGCCGTGCA CAGGGTGTCA CGTTGCAAGA CCTGCCTGAA ACCGAACTGC TAAGTCGCTC TCGGACTGGA TAACGTAGAG GGCGGCACGT GTCCCACAGT GCAACGTTCT GGACGGACTT TGGCTTGACG CCGCTGTTCT GCAGCCGGTC GGCGACAAGA CGTCGGCCAG 3601GCGGAGGCCA TGGATGCGAT CGCTGCGGCC GATCTTAGCC AGACGAGCGG GTTCGGCCCA TTCGGACCGC AAGGAATCGG CGCCTCCGGT ACCTACGCTA GCGACGCCGG CTAGAATCGG TCTGCTCGCC CAAGCCGGGT AAGCCTGGCG TTCCTTAGCC TCAATACACT ACATGGCGTG AGTTATGTGA TGTACCGCAC 3701ATTTCATATG CGCGATTGCT GATCCCCATG TGTATCACTG GCAAACTGTG ATGGACGACA CCGTCAGTGC GTCCGTCGCG TAAAGTATAC GCGCTAACGA CTAGGGGTAC ACATAGTGAC CGTTTGACAC TACCTGCTGT GGCAGTCACG CAGGCAGCGC CAGGCTCTCG ATGAGCTGAT GTCCGAGAGC TACTCGACTA 3801GCTTTGGGCC GAGGACTGCC CCGAAGTCCG GCACCTCGTG CACGCGGATT TCGGCTCCAA CAATGTCCTG ACGGACAATG CGAAACCCGG CTCCTGACGG GGCTTCAGGC CGTGGAGCAC GTGCGCCTAA AGCCGAGGTT GTTACAGGAC TGCCTGTTAC GCCGCATAAC AGCGGTCATT CGGCGTATTG 
TCGCCAGTAA 3901GACTGGAGCG AGGCGATGTT CGGGGATTCC CAATACGAGG TCGCCAACAT CTTCTTCTGG AGGCCGTGGT TGGCTTGTAT CTGACCTCGC TCCGCTACAA GCCCCTAAGG GTTATGCTCC AGCGGTTGTA GAAGAAGACC TCCGGCACCA ACCGAACATA GGAGCAGCAG ACGCGCTACT CCTCGTCGTC TGCGCGATGA 4001TCGAGCGGAG GCATCCGGAG CTTGCAGGAT CGCCGCGGCT CCGGGCGTAT ATGCTCCGCA TTGGTCTTGA CCAACTCTAT AGCTCGCCTC CGTAGGCCTC GAACGTCCTA GCGGCGCCGA GGCCCGCATA TACGAGGCGT AACCAGAACT GGTTGAGATA CAGAGCTTGG TTGACGGCAA GTCTCGAACC AACTGCCGTT 4101TTTCGATGAT GCAGCTTGGG CGCAGGGTCG ATGCGACGCA ATCGTCCGAT CCGGAGCCGG GACTGTCGGG CGTACACAAA AAAGCTACTA CGTCGAACCC GCGTCCCAGC TACGCTGCGT TAGCAGGCTA GGCCTCGGCC CTGACAGCCC GCATGTGTTT TCGCCCGCAG AAGCGCGGCC AGCGGGCGTC TTCGCGCCGG 4201GTCTGGACCG ATGGCTGTGT AGAAGTACTC GCCGATAGTG GAAACCGACG CCCCAGCACT CGTCCGGATC GGGAGATGGG CAGACCTGGC TACCGACACA TCTTCATGAG CGGCTATCAC CTTTGGCTGC GGGGTCGTGA GCAGGCCTAG CCCTCTACCC GGAGGCTAAC TGAAACACGG CCTCCGATTG ACTTTGTGCC 4301AAGGAGACAA TACCGGAAGG AACCTCGACG TTAACTTGTT TATTGCAGCT TATAATGGTT ACAAATAAAG CAATAGCATC TTCCTCTGTT ATGGCCTTCC TTGGAGCTGC AATTGAACAA ATAACGTCGA ATATTACCAA TGTTTATTTC GTTATCGTAG ACAAATTTCA CAAATAAAGC TGTTTAAAGT GTTTATTTCG 4401ATTTATTACC CTGTTATCCC TAGAATTCAC TGGCCGTCGT TTTACAACGT CGTGACTGGG AAAACCCTGG CGTTACCCAA TAAATAATGG GACAATAGGG ATCTTAAGTG ACCGGCAGCA AAATGTTGCA GCACTGACCC TTTTGGGACC GCAATGGGTT CTTAATCGCC TTGCAGCACA GAATTAGCGG AACGTCGTGT 4501TCCCCCTTTC GCCAGCTGGC GTAATAGCGA AGAGGCCCGC ACCGATCGCC CTTCCCAACA GTTGCGCAGC CTGAATGGCG AGGGGGAAAG CGGTCGACCG CATTATCGCT TCTCCGGGCG TGGCTAGCGG GAAGGGTTGT CAACGCGTCG GACTTACCGC AATGGCGCCT GATGCGGTAT TTACCGCGGA CTACGCCATA 4601TTTCTCCTTA CGCATCTGTG CGGTATTTCA CACCGCATAC GTCAAAGCAA CCATAGTACG CGCCCTGTAG CGGCGCATTA AAAGAGGAAT GCGTAGACAC GCCATAAAGT GTGGCGTATG CAGTTTCGTT GGTATCATGC GCGGGACATC GCCGCGTAAT AGCGCGGCGG GTGTGGTGGT TCGCGCCGCC CACACCACCA 4701TACGCGCAGC GTGACCGCTA CACTTGCCAG CGCCCTAGCG CCCGCTCCTT TCGCTTTCTT CCCTTCCTTT CTCGCCACGT ATGCGCGTCG CACTGGCGAT GTGAACGGTC GCGGGATCGC GGGCGAGGAA AGCGAAAGAA GGGAAGGAAA GAGCGGTGCA TCGCCGGCTT 
TCCCCGTCAA AGCGGCCGAA AGGGGCAGTT 4801GCTCTAAATC GGGGGCTCCC TTTAGGGTTC CGATTTAGTG CTTTACGGCA CCTCGACCCC AAAAAACTTG ATTTGGGTGA CGAGATTTAG CCCCCGAGGG AAATCCCAAG GCTAAATCAC GAAATGCCGT GGAGCTGGGG TTTTTTGAAC TAAACCCACT TGGTTCACGT AGTGGGCCAT ACCAAGTGCA TCACCCGGTA 4901CGCCCTGATA GACGGTTTTT CGCCCTTTGA CGTTGGAGTC CACGTTCTTT AATAGTGGAC TCTTGTTCCA AACTGGAACA GCGGGACTAT CTGCCAAAAA GCGGGAAACT GCAACCTCAG GTGCAAGAAA TTATCACCTG AGAACAAGGT TTGACCTTGT ACACTCAACC CTATCTCGGG TGTGAGTTGG GATAGAGCCC 5001 CTATTCTTTT GATTTATAAG GGATTTTGCC GATTTCGGCC TATTGGTTAA AAAATGAGCT GATTTAACAA AAATTTAACG GATAAGAAAA CTAAATATTC CCTAAAACGG CTAAAGCCGG ATAACCAATT TTTTACTCGA CTAAATTGTT TTTAAATTGC CGAATTTTAA CAAAATATTA GCTTAAAATT GTTTTATAAT 5101ACGTTTACAA TTTTATGGTG CACTCTCAGT ACAATCTGCT CTGATGCCGC ATAGTTAAGC CAGCCCCGAC ACCCGCCAAC TGCAAATGTT AAAATACCAC GTGAGAGTCA TGTTAGACGA GACTACGGCG TATCAATTCG GTCGGGGCTG TGGGCGGTTG ACCCGCTGAC GCGCCCTGAC TGGGCGACTG CGCGGGACTG 5201GGGCTTGTCT GCTCCCGGCA TCCGCTTACA GACAAGCTGT GACCGTCTAG ACGAAAGGGC CTCGTGATAC GCCTATTTTT CCCGAACAGA CGAGGGCCGT AGGCGAATGT CTGTTCGACA CTGGCAGATC TGCTTTCCCG GAGCACTATG CGGATAAAAA ATAGGTTAAT GTCATGATAA TATCCAATTA CAGTACTATT 5301TAATGGTTTC TTAGACGTCA GGTGGCACTT TTCGGGGAAA TGTGCGCGGA ACCCCTATTT GTTTATTTTT CTAAATACAT ATTACCAAAG AATCTGCAGT CCACCGTGAA AAGCCCCTTT ACACGCGCCT TGGGGATAAA CAAATAAAAA GATTTATGTA TCAAATATGT ATCCGCTCAT AGTTTATACA TAGGCGAGTA 5401GAGACAATAA CCCTGATAAA TGCTTCAATA ATATTGAAAA AGGAAGAGTA TGAGTATTCA ACATTTCCGT GTCGCCCTTA CTCTGTTATT GGGACTATTT ACGAAGTTAT TATAACTTTT TCCTTCTCAT ACTCATAAGT TGTAAAGGCA CAGCGGGAAT TTCCCTTTTT TGCGGCATTT AAGGGAAAAA ACGCCGTAAA 5501TGCCTTCCTG TTTTTGCTCA CCCAGAAACG CTGGTGAAAG TAAAAGATGC TGAAGATCAG TTGGGTGCAC GAGTGGGTTA ACGGAAGGAC AAAAACGAGT GGGTCTTTGC GACCACTTTC ATTTTCTACG ACTTCTAGTC AACCCACGTG CTCACCCAAT CATCGAACTG GATCTCAACA GTAGCTTGAC CTAGAGTTGT 5601GCGGTAAGAT CCTTGAGAGT TTTCGCCCCG AAGAACGTTT TCCAATGATG AGCACTTTTA AAGTTCTGCT ATGTGGCGCG CGCCATTCTA GGAACTCTCA AAAGCGGGGC TTCTTGCAAA AGGTTACTAC TCGTGAAAAT TTCAAGACGA 
TACACCGCGC GTATTATCCC GTATTGACGC CATAATAGGG CATAACTGCG 5701CGGGCAAGAG CAACTCGGTC GCCGCATACA CTATTCTCAG AATGACTTGG TTGAGTACTC ACCAGTCACA GAAAAGCATC GCCCGTTCTC GTTGAGCCAG CGGCGTATGT GATAAGAGTC TTACTGAACC AACTCATGAG TGGTCAGTGT CTTTTCGTAG TTACGGATGG CATGACAGTA AATGCCTACC GTACTGTCAT 5801AGAGAATTAT GCAGTGCTGC CATAACCATG AGTGATAACA CTGCGGCCAA CTTACTTCTG ACAACGATCG GAGGACCGAA TCTCTTAATA CGTCACGACG GTATTGGTAC TCACTATTGT GACGCCGGTT GAATGAAGAC TGTTGCTAGC CTCCTGGCTT GGAGCTAACC GCTTTTTTGC CCTCGATTGG CGAAAAAACG 5901ACAACATGGG GGATCATGTA ACTCGCCTTG ATCGTTGGGA ACCGGAGCTG AATGAAGCCA TACCAAACGA CGAGCGTGAC TGTTGTACCC CCTAGTACAT TGAGCGGAAC TAGCAACCCT TGGCCTCGAC TTACTTCGGT ATGGTTTGCT GCTCGCACTG ACCACGATGC CTGTAGCAAT TGGTGCTACG GACATCGTTA 6001GGCAACAACG TTGCGCAAAC TATTAACTGG CGAACTACTT ACTCTAGCTT CCCGGCAACA ATTAATAGAC TGGATGGAGG CCGTTGTTGC AACGCGTTTG ATAATTGACC GCTTGATGAA TGAGATCGAA GGGCCGTTGT TAATTATCTG ACCTACCTCC CGGATAAAGT TGCAGGACCA GCCTATTTCA ACGTCCTGGT 6101CTTCTGCGCT CGGCCCTTCC GGCTGGCTGG TTTATTGCTG ATAAATCTGG AGCCGGTGAG CGTGGGTCTC GCGGTATCAT GAAGACGCGA GCCGGGAAGG CCGACCGACC AAATAACGAC TATTTAGACC TCGGCCACTC GCACCCAGAG CGCCATAGTA TGCAGCACTG GGGCCAGATG ACGTCGTGAC CCCGGTCTAC 6201GTAAGCCCTC CCGTATCGTA GTTATCTACA CGACGGGGAG TCAGGCAACT ATGGATGAAC GAAATAGACA GATCGCTGAG CATTCGGGAG GGCATAGCAT CAATAGATGT GCTGCCCCTC AGTCCGTTGA TACCTACTTG CTTTATCTGT CTAGCGACTC ATAGGTGCCT CACTGATTAA TATCCACGGA GTGACTAATT 6301GCATTGGTAA CTGTCAGACC AAGTTTACTC ATATATACTT TAGATTGATT TAAAACTTCA TTTTTAATTT AAAAGGATCT CGTAACCATT GACAGTCTGG TTCAAATGAG TATATATGAA ATCTAACTAA ATTTTGAAGT AAAAATTAAA TTTTCCTAGA AGGTGAAGAT CCTTTTTGAT TCCACTTCTA GGAAAAACTA 6401AATCTCATGA CCAAAATCCC TTAACGTGAG TTTTCGTTCC ACTGAGCGTC AGACCCCGTA GAAAAGATCA AAGGATCTTC TTAGAGTACT GGTTTTAGGG AATTGCACTC AAAAGCAAGG TGACTCGCAG TCTGGGGCAT CTTTTCTAGT TTCCTAGAAG TTGAGATCCT TTTTTTCTGC AACTCTAGGA AAAAAAGACG 6501GCGTAATCTG CTGCTTGCAA ACAAAAAAAC CACCGCTACC AGCGGTGGTT TGTTTGCCGG ATCAAGAGCT ACCAACTCTT CGCATTAGAC GACGAACGTT TGTTTTTTTG GTGGCGATGG TCGCCACCAA 
ACAAACGGCC TAGTTCTCGA TGGTTGAGAA TTTCCGAAGG TAACTGGCTT AAAGGCTTCC ATTGACCGAA 6601CAGCAGAGCG CAGATACCAA ATACTGTCCT TCTAGTGTAG CCGTAGTTAG GCCACCACTT CAAGAACTCT GTAGCACCGC GTCGTCTCGC GTCTATGGTT TATGACAGGA AGATCACATC GGCATCAATC CGGTGGTGAA GTTCTTGAGA CATCGTGGCG CTACATACCT CGCTCTGCTA GATGTATGGA GCGAGACGAT 6701ATCCTGTTAC CAGTGGCTGC TGCCAGTGGC GATAAGTCGT GTCTTACCGG GTTGGACTCA AGACGATAGT TACCGGATAA TAGGACAATG GTCACCGACG ACGGTCACCG CTATTCAGCA CAGAATGGCC CAACCTGAGT TCTGCTATCA ATGGCCTATT GGCGCAGCGG TCGGGCTGAA CCGCGTCGCC AGCCCGACTT 6801CGGGGGGTTC GTGCACACAG CCCAGCTTGG AGCGAACGAC CTACACCGAA CTGAGATACC TACAGCGTGA GCTATGAGAA GCCCCCCAAG CACGTGTGTC GGGTCGAACC TCGCTTGCTG GATGTGGCTT GACTCTATGG ATGTCGCACT CGATACTCTT AGCGCCACGC TTCCCGAAGG TCGCGGTGCG AAGGGCTTCC 6901GAGAAAGGCG GACAGGTATC CGGTAAGCGG CAGGGTCGGA ACAGGAGAGC GCACGAGGGA GCTTCCAGGG GGAAACGCCT CTCTTTCCGC CTGTCCATAG GCCATTCGCC GTCCCAGCCT TGTCCTCTCG CGTGCTCCCT CGAAGGTCCC CCTTTGCGGA GGTATCTTTA TAGTCCTGTC CCATAGAAAT ATCAGGACAG 7001GGGTTTCGCC ACCTCTGACT TGAGCGTCGA TTTTTGTGAT GCTCGTCAGG GGGGCGGAGC CTATGGAAAA ACGCCAGCAA CCCAAAGCGG TGGAGACTGA ACTCGCAGCT AAAAACACTA CGAGCAGTCC CCCCGCCTCG GATACCTTTT TGCGGTCGTT CGCGGCCTTT TTACGGTTCC GCGCCGGAAA AATGCCAAGG 7101TGGCCTTTTG CTGGCCTTTT GCTCACATGT TCTTTCCTGC GTTATCCCCT GATTCTGTGG ATAACCGTAT TACCGCCTTT ACCGGAAAAC GACCGGAAAA CGAGTGTACA AGAAAGGACG CAATAGGGGA CTAAGACACC TATTGGCATA ATGGCGGAAA GAGTGAGCTG ATACCGCTCG CTCACTCGAC TATGGCGAGC 7201CCGCAGCCGA ACGACCGAGC GCAGCGAGTC AGTGAGCGAG GAAGCGGAAG GGCGTCGGCT TGCTGGCTCG CGTCGCTCAG TCACTCGCTC CTTCGCCTTC

pS14L-spAG-ΔN-MLuc15

    1AGCGCCCAAT ACGCAAACCG CCTCTCCCCG CGCGTTGGCC GATTCATTAA TGCAGCTGGC ACGACAGGTT TCCCGACTGG TCGCGGGTTA TGCGTTTGGC GGAGAGGGGC GCGCAACCGG CTAAGTAATT ACGTCGACCG TGCTGTCCAA AGGGCTGACC AAAGCGGGCA GTGAGCGCAA TTTCGCCCGT CACTCGCGTT  101CGCAATTAAT GTGAGTTAGC TCACTCATTA GGCACCCCAG GCTTTACACT TTATGCTTCC GGCTCGTATG TTGTGTGGAA GCGTTAATTA CACTCAATCG AGTGAGTAAT CCGTGGGGTC CGAAATGTGA AATACGAAGG CCGAGCATAC AACACACCTT TTGTGAGCGG ATAACAATTT AACACTCGCC 
TATTGTTAAA  201CACACAGGAA ACAGCTATGA CCATGATTAC GCCAAGCTTT AGGGATAACA GGGTAATCGC CATGCATTAG TTATTAATAG GTGTGTCCTT TGTCGATACT GGTACTAATG CGGTTCGAAA TCCCTATTGT CCCATTAGCG GTACGTAATC AATAATTATC TAATCAATTA CGGGGTCATT ATTAGTTAAT GCCCCAGTAA  301AGTTCATAGC CCATATATGG AGTTCCGCGT TACATAACTT ACGGTAAATG GCCCGCCTGG CTGACCGCCC AACGACCCCC TCAAGTATCG GGTATATACC TCAAGGCGCA ATGTATTGAA TGCCATTTAC CGGGCGGACC GACTGGCGGG TTGCTGGGGG GCCCATTGAC GTCAATAATG CGGGTAACTG CAGTTATTAC  401ACGTATGTTC CCATAGTAAC GCCAATAGGG ACTTTCCATT GACGTCAATG GGTGGAGTAT TTACGGTAAA CTGCCCACTT TGCATACAAG GGTATCATTG CGGTTATCCC TGAAAGGTAA CTGCAGTTAC CCACCTCATA AATGCCATTT GACGGGTGAA GGCAGTACAT CAAGTGTATC CCGTCATGTA GTTCACATAG  501ATATGCCAAG TACGCCCCCT ATTGACGTCA ATGACGGTAA ATGGCCCGCC TGGCATTATG CCCAGTACAT GACCTTATGG TATACGGTTC ATGCGGGGGA TAACTGCAGT TACTGCCATT TACCGGGCGG ACCGTAATAC GGGTCATGTA CTGGAATACC GACTTTCCTA CTTGGCAGTA CTGAAAGGAT GAACCGTCAT  601CATCTACGTA TTAGTCATCG CTATTACCAT GGTGATGCGG TTTTGGCAGT ACATCAATGG GCGTGGATAG CGGTTTGACT GTAGATGCAT AATCAGTAGC GATAATGGTA CCACTACGCC AAAACCGTCA TGTAGTTACC CGCACCTATC GCCAAACTGA CACGGGGATT TCCAAGTCTC GTGCCCCTAA AGGTTCAGAG  701CACCCCATTG ACGTCAATGG GAGTTTGTTT TGGCACCAAA ATCAACGGGA CTTTCCAAAA TGTCGTAACA ACTCCGCCCC GTGGGGTAAC TGCAGTTACC CTCAAACAAA ACCGTGGTTT TAGTTGCCCT GAAAGGTTTT ACAGCATTGT TGAGGCGGGG ATTGACGCAA ATGGGCGGTA TAACTGCGTT TACCCGCCAT  801GGCGTGTACG GTGGGAGGTC TATATAAGCA GAGCTGGTTT AGTGAACCGT CAGATCCGCT AGACGTCTCA TTTAGGCATG CCGCACATGC CACCCTCCAG ATATATTCGT CTCGACCAAA TCACTTGGCA GTCTAGGCGA TCTGCAGAGT AAATCCGTAC GAAACCCCAG CGCAGCTTCT CTTTGGGGTC GCGTCGAAGA  901CTTCCTCCTG CTACTCTGGA TCCCAGACAC CATTGAAGAA ATAGTGATGA CGCAGTCTCC AGCCACCCTG TCTGTGTCTC GAAGGAGGAC GATGAGACCT AGGGTCTGTG GTAACTTCTT TATCACTACT GCGTCAGAGG TCGGTGGGAC AGACACAGAG CAGGGGAAAG AGTCACCCTC GTCCCCTTTC TCAGTGGGAG 1001TCCAGCAGCC ATCATCATCA TCATCACAGC AGCGGCCTGG TGCCGCGCGG CAGCCATAGG TCGACTCTAG AGGATCCAAG AGGTCGTCGG TAGTAGTAGT AGTAGTGTCG TCGCCGGACC ACGGCGCGCC GTCGGTATCC AGCTGAGATC TCCTAGGTTC CCAAAGCACT 
AACGTTTTAG GGTTTCGTGA TTGCAAAATC 1101GTGAAGCTAA AAAATTAAAC GAATCTCAAG CACCGAAAGC TGACAACAAT TTCAACAAAG AACAACAAAA TGCTTTCTAT CACTTCGATT TTTTAATTTG CTTAGAGTTC GTGGCTTTCG ACTGTTGTTA AAGTTGTTTC TTGTTGTTTT ACGAAAGATA GAAATCTTGA ACATGCCTAA CTTTAGAACT TGTACGGATT 1201CTTGAACGAA GAACAACGCA ATGGTTTCAT CCAAAGCTTA AAAGATGACC CAAGTCAAAG TGCTAACCTT TTAGCAGAAG GAACTTGCTT CTTGTTGCGT TACCAAAGTA GGTTTCGAAT TTTCTACTGG GTTCAGTTTC ACGATTGGAA AATCGTCTTC CTAAAAAGTT AAATGAATCT GATTTTTCAA TTTACTTAGA 1301CAAGCACCGA AAGCTGATAA CAAATTCAAC AAAGAACAAC AAAATGCTTT CTATGAAATC TTACATTTAC CTAACTTAAA GTTCGTGGCT TTCGACTATT GTTTAAGTTG TTTCTTGTTG TTTTACGAAA GATACTTTAG AATGTAAATG GATTGAATTT TGAAGAACAA CGCAATGGTT ACTTCTTGTT GCGTTACCAA 1401TCATCCAAAG CTTAAAAGAT GACCCAAGCC AAAGCGCTAA CCTTTTAGCA GAAGCTAAAA AGCTAAATGA TGCACAAGCA AGTAGGTTTC GAATTTTCTA CTGGGTTCGG TTTCGCGATT GGAAAATCGT CTTCGATTTT TCGATTTACT ACGTGTTCGT CCAAAAGCTG ACAACAAATT GGTTTTCGAC TGTTGTTTAA 1501CAACAAAGAA CAACAAAATG CTTTCTATGA AATTTTACAT TTACCTAACT TAACTGAAGA ACAACGTAAC GGCTTCATCC GTTGTTTCTT GTTGTTTTAC GAAAGATACT TTAAAATGTA AATGGATTGA ATTGACTTCT TGTTGCATTG CCGAAGTAGG AAAGCCTTAA AGACGATCCC TTTCGGAATT TCTGCTAGGG 1601CGGTCGACTC TAGCGGCAGC TTCCGGTGCT AGCACTGACA CTTACAAATT AATCCTTAAT GGTAAAACAT TGAAAGGCGA GCCAGCTGAG ATCGCCGTCG AAGGCCACGA TCGTGACTGT GAATGTTTAA TTAGGAATTA CCATTTTGTA ACTTTCCGCT AACAACTACT GAAGCTGTTG TTGTTGATGA CTTCGACAAC 1701ATGCTGCTAC TGCAGAAAAA GTCTTCAAAC AATACGCTAA CGACAACGGT GTTGACGGTG AATGGACTTA CGACGATGCG TACGACGATG ACGTCTTTTT CAGAAGTTTG TTATGCGATT GCTGTTGCCA CAACTGCCAC TTACCTGAAT GCTGCTACGC ACTAAGACCT TTACAGTTAC TGATTCTGGA AATGTCAATG 1801TGAAAAACCA GAAGTGATCG ATGCGTCTGA ATTAACACCA GCCGTGACAA CTTACAAACT TGTTATTAAT GGTAAAACAT ACTTTTTGGT CTTCACTAGC TACGCAGACT TAATTGTGGT CGGCACTGTT GAATGTTTGA ACAATAATTA CCATTTTGTA TGAAAGGCGA AACAACTACT ACTTTCCGCT TTGTTGATGA 1901AAAGCAGTAG ACGCAGAAAC TGCAGAAAAA GCCTTCAAAC AATACGCTAA CGACAACGGT GTTGATGGTG TTTGGACTTA TTTCGTCATC TGCGTCTTTG ACGTCTTTTT CGGAAGTTTG TTATGCGATT GCTGTTGCCA CAACTACCAC 
AAACCTGAAT TGATGATGCG ACTAAGACCT ACTACTACGC TGATTCTGGA 2001TTACGGTAAC TGAAATGGTT ACAGAGGTAC CAGATCTTAG CAACTTTGTT GCAACTGAAA CCGATGCTAA CCGCGGAAAA AATGCCATTG ACTTTACCAA TGTCTCCATG GTCTAGAATC GTTGAAACAA CGTTGACTTT GGCTACGATT GGCGCCTTTT ATGCCTGGCA AAAAACTGCC TACGGACCGT TTTTTGACGG 2101ACTGGCAGTT ATCATGGAAA TGGAAGCCAA TGCTTTCAAA GCTGGCTGCA CCAGGGGATG CCTTATCTGT CTTTCAAAAA TGACCGTCAA TAGTACCTTT ACCTTCGGTT ACGAAAGTTT CGACCGACGT GGTCCCCTAC GGAATAGACA GAAAGTTTTT TTAAGTGTAC AGCCAAAATG AATTCACATG TCGGTTTTAC 2201AAGGTATACA TTCCAGGAAG GTGTCACGAT TATGGTGGTG ACAAGAAAAC TGGACAGGCA GGAATTGTTG GTGCAATTGT TTCCATATGT AAGGTCCTTC CACAGTGCTA ATACCACCAC TGTTCTTTTG ACCTGTCCGT CCTTAACAAC CACGTTAACA TGACATTCCC GAAATCTCTG ACTGTAAGGG CTTTAGAGAC 2301GATTTAAGGA GATGGCACCC ATGGAACAGT TCATTGCTCA AGTTGATCGC TGCGCTTCCT GCACTACTGG ATGTCTCAAA CTAAATTCCT CTACCGTGGG TACCTTGTCA AGTAACGAGT TCAACTAGCG ACGCGAAGGA CGTGATGACC TACAGAGTTT GGTCTTGCCA ATGTTAAGTG CCAGAACGGT TACAATTCAC 2401CTCTGAACTC CTGAAGAAAT GGCTGCCTGA CAGGTGTGCA AGTTTTGCTG ACAAGATTCA AAAAGAAGTT CACAATATCA GAGACTTGAG GACTTCTTTA CCGACGGACT GTCCACACGT TCAAAACGAC TGTTCTAAGT TTTTCTTCAA GTGTTATAGT AAGGCATGGC CGGCGATCGA TTCCGTACCG GCCGCTAGCT 2501TGAGCGGCCG CAATTTAATT CCGGTTATTT TCCACCATAT TGCCGTCTTT TGGCAATGTG AGGGCCCGGA AACCTGGCCC ACTCGCCGGC GTTAAATTAA GGCCAATAAA AGGTGGTATA ACGGCAGAAA ACCGTTACAC TCCCGGGCCT TTGGACCGGG TGTCTTCTTG ACGAGCATTC ACAGAAGAAC TGCTCGTAAG 2601CTAGGGGTCT TTCCCCTCTC GCCAAAGGAA TGCAAGGTCT GTTGAATGTC GTGAAGGAAG CAGTTCCTCT GGAAGCTTCT GATCCCCAGA AAGGGGAGAG CGGTTTCCTT ACGTTCCAGA CAACTTACAG CACTTCCTTC GTCAAGGAGA CCTTCGAAGA TGAAGACAAA CAACGTCTGT ACTTCTGTTT GTTGCAGACA 2701AGCGACCCTT TGCAGGCAGC GGAACCCCCC ACCTGGCGAC AGGTGCCTCT GCGGCCAAAA GCCACGTGTA TAAGATACAC TCGCTGGGAA ACGTCCGTCG CCTTGGGGGG TGGACCGCTG TCCACGGAGA CGCCGGTTTT CGGTGCACAT ATTCTATGTG CTGCAAAGGC GGCACAACCC GACGTTTCCG CCGTGTTGGG 2801CAGTGCCACG TTGTGAGTTG GATAGTTGTG GAAAGAGTCA AATGGCTCAC CTCAAGCGTA TTCAACAAGG GGCTGAAGGA GTCACGGTGC AACACTCAAC CTATCAACAC CTTTCTCAGT TTACCGAGTG 
GAGTTCGCAT AAGTTGTTCC CCGACTTCCT TGCCCAGAAG GTACCCCATT ACGGGTCTTC CATGGGGTAA 2901GTATGGGATC TGATCTGGGG CCTCGGTGCA CATGCTTTAC ATGTGTTTAG TCGAGGTTAA AAAACGTCTA GGCCCCCCGA CATACCCTAG ACTAGACCCC GGAGCCACGT GTACGAAATG TACACAAATC AGCTCCAATT TTTTGCAGAT CCGGGGGGCT ACCACGGGGA CGTGGTTTTC TGGTGCCCCT GCACCAAAAG 3001CTTTGAAAAA CACGATGATA ATATGGCCAC CACCCATACC TAGGCTTTTG CAAAGATCGA TCAGATCCCG GGGGGCAATG GAAACTTTTT GTGCTACTAT TATACCGGTG GTGGGTATGG ATCCGAAAAC GTTTCTAGCT AGTCTAGGGC CCCCCGTTAC AGATATGAAA AAGCCTGAAC TCTATACTTT TTCGGACTTG 3101TCACCGCGAC GTCTGTCGAG AAGTTTCTGA TCGAAAAGTT CGACAGCGTC TCCGACCTGA TGCAGCTCTC GGAGGGCGAA AGTGGCGCTG CAGACAGCTC TTCAAAGACT AGCTTTTCAA GCTGTCGCAG AGGCTGGACT ACGTCGAGAG CCTCCCGCTT GAATCTCGTG CTTTCAGCTT CTTAGAGCAC GAAAGTCGAA 3201CGATGTAGGA GGGCGTGGAT ATGTCCTGCG GGTAAATAGC TGCGCCGATG GTTTCTACAA AGATCGTTAT GTTTATCGGC GCTACATCCT CCCGCACCTA TACAGGACGC CCATTTATCG ACGCGGCTAC CAAAGATGTT TCTAGCAATA CAAATAGCCG ACTTTGCATC GGCCGCGCTC TGAAACGTAG CCGGCGCGAG 3301CCGATTCCGG AAGTGCTTGA CATTGGGGAA TTCAGCGAGA GCCTGACCTA TTGCATCTCC CGCCGTGCAC AGGGTGTCAC GGCTAAGGCC TTCACGAACT GTAACCCCTT AAGTCGCTCT CGGACTGGAT AACGTAGAGG GCGGCACGTG TCCCACAGTG GTTGCAAGAC CTGCCTGAAA CAACGTTCTG GACGGACTTT 3401CCGAACTGCC CGCTGTTCTG CAGCCGGTCG CGGAGGCCAT GGATGCGATC GCTGCGGCCG ATCTTAGCCA GACGAGCGGG GGCTTGACGG GCGACAAGAC GTCGGCCAGC GCCTCCGGTA CCTACGCTAG CGACGCCGGC TAGAATCGGT CTGCTCGCCC TTCGGCCCAT TCGGACCGCA AAGCCGGGTA AGCCTGGCGT 3501 AGGAATCGGT CAATACACTA CATGGCGTGA TTTCATATGC GCGATTGCTG ATCCCCATGT GTATCACTGG CAAACTGTGA TCCTTAGCCA GTTATGTGAT GTACCGCACT AAAGTATACG CGCTAACGAC TAGGGGTACA CATAGTGACC GTTTGACACT TGGACGACAC CGTCAGTGCG ACCTGCTGTG GCAGTCACGC 3601TCCGTCGCGC AGGCTCTCGA TGAGCTGATG CTTTGGGCCG AGGACTGCCC CGAAGTCCGG CACCTCGTGC ACGCGGATTT AGGCAGCGCG TCCGAGAGCT ACTCGACTAC GAAACCCGGC TCCTGACGGG GCTTCAGGCC GTGGAGCACG TGCGCCTAAA CGGCTCCAAC AATGTCCTGA GCCGAGGTTG TTACAGGACT 3701CGGACAATGG CCGCATAACA GCGGTCATTG ACTGGAGCGA GGCGATGTTC GGGGATTCCC AATACGAGGT CGCCAACATC GCCTGTTACC GGCGTATTGT CGCCAGTAAC 
TGACCTCGCT CCGCTACAAG CCCCTAAGGG TTATGCTCCA GCGGTTGTAG TTCTTCTGGA GGCCGTGGTT AAGAAGACCT CCGGCACCAA 3801GGCTTGTATG GAGCAGCAGA CGCGCTACTT CGAGCGGAGG CATCCGGAGC TTGCAGGATC GCCGCGGCTC CGGGCGTATA CCGAACATAC CTCGTCGTCT GCGCGATGAA GCTCGCCTCC GTAGGCCTCG AACGTCCTAG CGGCGCCGAG GCCCGCATAT TGCTCCGCAT TGGTCTTGAC ACGAGGCGTA ACCAGAACTG 3901CAACTCTATC AGAGCTTGGT TGACGGCAAT TTCGATGATG CAGCTTGGGC GCAGGGTCGA TGCGACGCAA TCGTCCGATC GTTGAGATAG TCTCGAACCA ACTGCCGTTA AAGCTACTAC GTCGAACCCG CGTCCCAGCT ACGCTGCGTT AGCAGGCTAG CGGAGCCGGG ACTGTCGGGC GCCTCGGCCC TGACAGCCCG 4001GTACACAAAT CGCCCGCAGA AGCGCGGCCG TCTGGACCGA TGGCTGTGTA GAAGTACTCG CCGATAGTGG AAACCGACGC CATGTGTTTA GCGGGCGTCT TCGCGCCGGC AGACCTGGCT ACCGACACAT CTTCATGAGC GGCTATCACC TTTGGCTGCG CCCAGCACTC GTCCGGATCG GGGTCGTGAG CAGGCCTAGC 4101GGAGATGGGG GAGGCTAACT GAAACACGGA AGGAGACAAT ACCGGAAGGA ACCTCGACGT TAACTTGTTT ATTGCAGCTT CCTCTACCCC CTCCGATTGA CTTTGTGCCT TCCTCTGTTA TGGCCTTCCT TGGAGCTGCA ATTGAACAAA TAACGTCGAA ATAATGGTTA CAAATAAAGC TATTACCAAT GTTTATTTCG 4201AATAGCATCA CAAATTTCAC AAATAAAGCA TTTATTACCC TGTTATCCCT AGAATTCACT GGCCGTCGTT TTACAACGTC TTATCGTAGT GTTTAAAGTG TTTATTTCGT AAATAATGGG ACAATAGGGA TCTTAAGTGA CCGGCAGCAA AATGTTGCAG GTGACTGGGA AAACCCTGGC CACTGACCCT TTTGGGACCG 4301GTTACCCAAC TTAATCGCCT TGCAGCACAT CCCCCTTTCG CCAGCTGGCG TAATAGCGAA GAGGCCCGCA CCGATCGCCC CAATGGGTTG AATTAGCGGA ACGTCGTGTA GGGGGAAAGC GGTCGACCGC ATTATCGCTT CTCCGGGCGT GGCTAGCGGG TTCCCAACAG TTGCGCAGCC AAGGGTTGTC AACGCGTCGG 4401TGAATGGCGA ATGGCGCCTG ATGCGGTATT TTCTCCTTAC GCATCTGTGC GGTATTTCAC ACCGCATACG TCAAAGCAAC ACTTACCGCT TACCGCGGAC TACGCCATAA AAGAGGAATG CGTAGACACG CCATAAAGTG TGGCGTATGC AGTTTCGTTG CATAGTACGC GCCCTGTAGC GTATCATGCG CGGGACATCG 4501GGCGCATTAA GCGCGGCGGG TGTGGTGGTT ACGCGCAGCG TGACCGCTAC ACTTGCCAGC GCCCTAGCGC CCGCTCCTTT CCGCGTAATT CGCGCCGCCC ACACCACCAA TGCGCGTCGC ACTGGCGATG TGAACGGTCG CGGGATCGCG GGCGAGGAAA CGCTTTCTTC CCTTCCTTTC GCGAAAGAAG GGAAGGAAAG 4601TCGCCACGTT CGCCGGCTTT CCCCGTCAAG CTCTAAATCG GGGGCTCCCT TTAGGGTTCC GATTTAGTGC TTTACGGCAC AGCGGTGCAA 
GCGGCCGAAA GGGGCAGTTC GAGATTTAGC CCCCGAGGGA AATCCCAAGG CTAAATCACG AAATGCCGTG CTCGACCCCA AAAAACTTGA GAGCTGGGGT TTTTTGAACT 4701TTTGGGTGAT GGTTCACGTA GTGGGCCATC GCCCTGATAG ACGGTTTTTC GCCCTTTGAC GTTGGAGTCC ACGTTCTTTA AAACCCACTA CCAAGTGCAT CACCCGGTAG CGGGACTATC TGCCAAAAAG CGGGAAACTG CAACCTCAGG TGCAAGAAAT ATAGTGGACT CTTGTTCCAA TATCACCTGA GAACAAGGTT 4801ACTGGAACAA CACTCAACCC TATCTCGGGC TATTCTTTTG ATTTATAAGG GATTTTGCCG ATTTCGGCCT ATTGGTTAAA TGACCTTGTT GTGAGTTGGG ATAGAGCCCG ATAAGAAAAC TAAATATTCC CTAAAACGGC TAAAGCCGGA TAACCAATTT AAATGAGCTG ATTTAACAAA TTTACTCGAC TAAATTGTTT 4901AATTTAACGC GAATTTTAAC AAAATATTAA CGTTTACAAT TTTATGGTGC ACTCTCAGTA CAATCTGCTC TGATGCCGCA TTAAATTGCG CTTAAAATTG TTTTATAATT GCAAATGTTA AAATACCACG TGAGAGTCAT GTTAGACGAG ACTACGGCGT TAGTTAAGCC AGCCCCGACA ATCAATTCGG TCGGGGCTGT 5001CCCGCCAACA CCCGCTGACG CGCCCTGACG GGCTTGTCTG CTCCCGGCAT CCGCTTACAG ACAAGCTGTG ACCGTCTAGA GGGCGGTTGT GGGCGACTGC GCGGGACTGC CCGAACAGAC GAGGGCCGTA GGCGAATGTC TGTTCGACAC TGGCAGATCT CGAAAGGGCC TCGTGATACG GCTTTCCCGG AGCACTATGC 5101CCTATTTTTA TAGGTTAATG TCATGATAAT AATGGTTTCT TAGACGTCAG GTGGCACTTT TCGGGGAAAT GTGCGCGGAA GGATAAAAAT ATCCAATTAC AGTACTATTA TTACCAAAGA ATCTGCAGTC CACCGTGAAA AGCCCCTTTA CACGCGCCTT CCCCTATTTG TTTATTTTTC GGGGATAAAC AAATAAAAAG 5201TAAATACATT CAAATATGTA TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGTAT ATTTATGTAA GTTTATACAT AGGCGAGTAC TCTGTTATTG GGACTATTTA CGAAGTTATT ATAACTTTTT CCTTCTCATA GAGTATTCAA CATTTCCGTG CTCATAAGTT GTAAAGGCAC 5301TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT TTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT AGCGGGAATA AGGGAAAAAA CGCCGTAAAA CGGAAGGACA AAAACGAGTG GGTCTTTGCG ACCACTTTCA TTTTCTACGA GAAGATCAGT TGGGTGCACG CTTCTAGTCA ACCCACGTGC 5401AGTGGGTTAC ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT CCAATGATGA TCACCCAATG TAGCTTGACC TAGAGTTGTC GCCATTCTAG GAACTCTCAA AAGCGGGGCT TCTTGCAAAA GGTTACTACT GCACTTTTAA AGTTCTGCTA CGTGAAAATT TCAAGACGAT 5501TGTGGCGCGG TATTATCCCG TATTGACGCC GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA 
ATGACTTGGT AAAAGCATCT TACGGATGGC ATGACAGTAA GAGAATTATG CAGTGCTGCC ATAACCATGA GTGATAACAC TGCGGCCAAC TGAGTACTCA CCAGTCACAG TTACTTCTGA CAACGATCGG 5601TTTTCGTAGA ATGCCTACCG TACTGTCATT CTCTTAATAC GTCACGACGG TATTGGTACT CACTATTGTG ACGCCGGTTG TTTTCGTAGA ATGCCTACCG TACTGTCATT CTCTTAATAC GTCACGACGG TATTGGTACT CACTATTGTG ACGCCGGTTG AATGAAGACT GTTGCTAGCC AATGAAGACT GTTGCTAGCC 5701AGGACCGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA CTCGCCTTGA TCGTTGGGAA CCGGAGCTGA TCCTGGCTTC CTCGATTGGC GAAAAAACGT GTTGTACCCC CTAGTACATT GAGCGGAACT AGCAACCCTT GGCCTCGACT ATGAAGCCAT ACCAAACGAC TACTTCGGTA TGGTTTGCTG 5801GAGCGTGACA CCACGATGCC TGTAGCAATG GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CTCGCACTGT GGTGCTACGG ACATCGTTAC CGTTGTTGCA ACGCGTTTGA TAATTGACCG CTTGATGAAT GAGATCGAAG CCGGCAACAA TTAATAGACT GGCCGTTGTT AATTATCTGA 5901GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG GCTGGCTGGT TTATTGCTGA TAAATCTGGA CCTACCTCCG CCTATTTCAA CGTCCTGGTG AAGACGCGAG CCGGGAAGGC CGACCGACCA AATAACGACT ATTTAGACCT GCCGGTGAGC GTGGGTCTCG CGGCCACTCG CACCCAGAGC 6001CGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG TTATCTACAC GACGGGGAGT CAGGCAACTA GCCATAGTAA CGTCGTGACC CCGGTCTACC ATTCGGGAGG GCATAGCATC AATAGATGTG CTGCCCCTCA GTCCGTTGAT TGGATGAACG AAATAGACAG ACCTACTTGC TTTATCTGTC 6101ATCGCTGAGA TAGGTGCCTC ACTGATTAAG CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATTGATTT TAGCGACTCT ATCCACGGAG TGACTAATTC GTAACCATTG ACAGTCTGGT TCAAATGAGT ATATATGAAA TCTAACTAAA AAAACTTCAT TTTTAATTTA TTTTGAAGTA AAAATTAAAT 6201AAAGGATCTA GGTGAAGATC CTTTTTGATA ATCTCATGAC CAAAATCCCT TAACGTGAGT TTTCGTTCCA CTGAGCGTCA TTTCCTAGAT CCACTTCTAG GAAAAACTAT TAGAGTACTG GTTTTAGGGA ATTGCACTCA AAAGCAAGGT GACTCGCAGT GACCCCGTAG AAAAGATCAA CTGGGGCATC TTTTCTAGTT 6301AGGATCTTCT TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA GCGGTGGTTT TCCTAGAAGA ACTCTAGGAA AAAAAGACGC GCATTAGACG ACGAACGTTT GTTTTTTTGG TGGCGATGGT CGCCACCAAA GTTTGCCGGA TCAAGAGCTA CAAACGGCCT AGTTCTCGAT 6401CCAACTCTTT TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA 
TACTGTCCTT CTAGTGTAGC CGTAGTTAGG GGTTGAGAAA AAGGCTTCCA TTGACCGAAG TCGTCTCGCG TCTATGGTTT ATGACAGGAA GATCACATCG GCATCAATCC CCACCACTTC AAGAACTCTG GGTGGTGAAG TTCTTGAGAC 6501TAGCACCGCC TACATACCTC GCTCTGCTAA TCCTGTTACC AGTGGCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG ATCGTGGCGG ATGTATGGAG CGAGACGATT AGGACAATGG TCACCGACGA CGGTCACCGC TATTCAGCAC AGAATGGCCC TTGGACTCAA GACGATAGTT AACCTGAGTT CTGCTATCAA 6601ACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC TACACCGAAC TGGCCTATTC CGCGTCGCCA GCCCGACTTG CCCCCCAAGC ACGTGTGTCG GGTCGAACCT CGCTTGCTGG ATGTGGCTTG TGAGATACCT ACAGCGTGAG ACTCTATGGA TGTCGCACTC 6701CTATGAGAAA GCGCCACGCT TCCCGAAGGG AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG GATACTCTTT CGCGGTGCGA AGGGCTTCCC TCTTTCCGCC TGTCCATAGG CCATTCGCCG TCCCAGCCTT GTCCTCTCGC CACGAGGGAG CTTCCAGGGG GTGCTCCCTC GAAGGTCCCC 6801GAAACGCCTG GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG CTTTGCGGAC CATAGAAATA TCAGGACAGC CCAAAGCGGT GGAGACTGAA CTCGCAGCTA AAAACACTAC GAGCAGTCCC GGGCGGAGCC TATGGAAAAA CCCGCCTCGG ATACCTTTTT 6901CGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG CTCACATGTT CTTTCCTGCG TTATCCCCTG GCGGTCGTTG CGCCGGAAAA ATGCCAAGGA CCGGAAAACG ACCGGAAAAC GAGTGTACAA GAAAGGACGC AATAGGGGAC ATTCTGTGGA TAACCGTATT TAAGACACCT ATTGGCATAA 7001ACCGCCTTTG AGTGAGCTGA TACCGCTCGC CGCAGCCGAA CGACCGAGCG CAGCGAGTCA GTGAGCGAGG AAGCGGAAG TGGCGGAAAC TCACTCGACT ATGGCGAGCG GCGTCGGCTT GCTGGCTCGC GTCGCTCAGT CACTCGCTCC TTCGCCTTC

APPENDIX 5

Sequence of the plasmid encoding bioSNAP25-ΔN-MLuc hybrid.

pS14LbioSNAP25-ΔN-MLuc-CITE-Hyg1

    1AGCGCCCAAT ACGCAAACCG CCTCTCCCCG CGCGTTGGCC GATTCATTAA TGCAGCTGGC ACGACAGGTT TCCCGACTGG TCGCGGGTTA TGCGTTTGGC GGAGAGGGGC GCGCAACCGG CTAAGTAATT ACGTCGACCG TGCTGTCCAA AGGGCTGACC AAAGCGGGCA GTGAGCGCAA TTTCGCCCGT CACTCGCGTT  101CGCAATTAAT GTGAGTTAGC TCACTCATTA GGCACCCCAG GCTTTACACT TTATGCTTCC GGCTCGTATG TTGTGTGGAA GCGTTAATTA CACTCAATCG AGTGAGTAAT CCGTGGGGTC CGAAATGTGA AATACGAAGG CCGAGCATAC AACACACCTT TTGTGAGCGG ATAACAATTT AACACTCGCC TATTGTTAAA  201CACACAGGAA ACAGCTATGA CCATGATTAC GCCAAGCTTT AGGGATAACA GGGTAATCGC CATGCATTAG TTATTAATAG GTGTGTCCTT TGTCGATACT GGTACTAATG CGGTTCGAAA TCCCTATTGT CCCATTAGCG GTACGTAATC AATAATTATC TAATCAATTA CGGGGTCATT ATTAGTTAAT GCCCCAGTAA  301AGTTCATAGC CCATATATGG AGTTCCGCGT TACATAACTT ACGGTAAATG GCCCGCCTGG CTGACCGCCC AACGACCCCC TCAAGTATCG GGTATATACC TCAAGGCGCA ATGTATTGAA TGCCATTTAC CGGGCGGACC GACTGGCGGG TTGCTGGGGG GCCCATTGAC GTCAATAATG CGGGTAACTG CAGTTATTAC  401ACGTATGTTC CCATAGTAAC GCCAATAGGG ACTTTCCATT GACGTCAATG GGTGGAGTAT TTACGGTAAA CTGCCCACTT TGCATACAAG GGTATCATTG CGGTTATCCC TGAAAGGTAA CTGCAGTTAC CCACCTCATA AATGCCATTT GACGGGTGAA GGCAGTACAT CAAGTGTATC CCGTCATGTA GTTCACATAG  501ATATGCCAAG TACGCCCCCT ATTGACGTCA ATGACGGTAA ATGGCCCGCC TGGCATTATG CCCAGTACAT GACCTTATGG TATACGGTTC ATGCGGGGGA TAACTGCAGT TACTGCCATT TACCGGGCGG ACCGTAATAC GGGTCATGTA CTGGAATACC GACTTTCCTA CTTGGCAGTA CTGAAAGGAT GAACCGTCAT  601CATCTACGTA TTAGTCATCG CTATTACCAT GGTGATGCGG TTTTGGCAGT ACATCAATGG GCGTGGATAG CGGTTTGACT GTAGATGCAT AATCAGTAGC GATAATGGTA CCACTACGCC AAAACCGTCA TGTAGTTACC CGCACCTATC GCCAAACTGA CACGGGGATT TCCAAGTCTC GTGCCCCTAA AGGTTCAGAG  701CACCCCATTG ACGTCAATGG GAGTTTGTTT TGGCACCAAA ATCAACGGGA CTTTCCAAAA TGTCGTAACA ACTCCGCCCC GTGGGGTAAC TGCAGTTACC CTCAAACAAA ACCGTGGTTT TAGTTGCCCT GAAAGGTTTT ACAGCATTGT TGAGGCGGGG ATTGACGCAA ATGGGCGGTA TAACTGCGTT TACCCGCCAT  801GGCGTGTACG GTGGGAGGTC TATATAAGCA GAGCTGGTTT AGTGAACCGT CAGATCCGCT AGACGTCTCA TTTAGGCATG CCGCACATGC 
CACCCTCCAG ATATATTCGT CTCGACCAAA TCACTTGGCA GTCTAGGCGA TCTGCAGAGT AAATCCGTAC GAAACCCCAG CGCAGCTTCT CTTTGGGGTC GCGTCGAAGA  901CTTCCTCCTG CTACTCTGGA TCCCAGACAC CATTGAAGAA ATAGTGATGA CGCAGTCTCC AGCCACCCTG TCTGTGTCTC GAAGGAGGAC GATGAGACCT AGGGTCTGTG GTAACTTCTT TATCACTACT GCGTCAGAGG TCGGTGGGAC AGACACAGAG CAGGGGAAAG AGTCACCCTC GTCCCCTTTC TCAGTGGGAG 1001TCCTCAGGCG GCGCAAGCAG CCTGAGACAG ATTCTGGACT CCCAGAAAAT GGAGTGGAGG TCCAACGCCG GGGGCAGCGG AGGAGTCCGC CGCGTTCGTC GGACTCTGTC TAAGACCTGA GGGTCTTTTA CCTCACCTCC AGGTTGCGGC CCCCGTCGCC TAGGGATAAC AGGGTAATCG ATCCCTATTG TCCCATTAGC 1101CCGAGGACGC AGACATGCGT AATGAACTGG AGGAGATGCA GAGGAGGGCT GACCAGCTGG CTGATGAGTC CCTGGAAAGC GGCTCCTGCG TCTGTACGCA TTACTTGACC TCCTCTACGT CTCCTCCCGA CTGGTCGACC GACTACTCAG GGACCTTTCG ACCCGTCGCA TGCTGCAGCT TGGGCAGCGT ACGACGTCGA 1201GGTCGAAGAG AGTAAAGATG CTGGCATCAG GACTTTGGTT ATGTTGGATG AGCAAGGCGA ACAACTGGAA CGCATTGAGG CCAGCTTCTC TCATTTCTAC GACCGTAGTC CTGAAACCAA TACAACCTAC TCGTTCCGCT TGTTGACCTT GCGTAACTCC AAGGGATGGA CCAAATCAAT TTCCCTACCT GGTTTAGTTA 1301AAGGATATGA AAGAAGCAGA AAAGAATTTG ACGGACCTAG GAAAATTCTG CGGGCTTTGT GTGTGTCCCT GTAACAAGCT TTCCTATACT TTCTTCGTCT TTTCTTAAAC TGCCTGGATC CTTTTAAGAC GCCCGAAACA CACACAGGGA CATTGTTCGA TAAATCCAGT GATGCTTACA ATTTAGGTCA CTACGAATGT 1401AAAAAGCCTG GGGCAATAAT CAGGATGGAG TAGTGGCCAG CCAGCCTGCC CGTGTGGTGG ATGAACGGGA GCAGATGGCC TTTTTCGGAC CCCGTTATTA GTCCTACCTC ATCACCGGTC GGTCGGACGG GCACACCACC TACTTGCCCT CGTCTACCGG ATCAGTGGTG GCTTCATCCG TAGTCACCAC CGAAGTAGGC 1501CAGGGTAACA AACGATGCCC GGGAAAATGA AATGGATGAA AACCTAGAGC AGGTGAGCGG CATCATCGGA AACCTCCGTC GTCCCATTGT TTGCTACGGG CCCTTTTACT TTACCTACTT TTGGATCTCG TCCACTCGCC GTAGTAGCCT TTGGAGGCAG ATATGGCCCT AGACATGGGC TATACCGGGA TCTGTACCCG 1601AATGAGATTG ACACCCAGAA TCGCCAGATT GACAGGATCA TGGAGAAGGC TGACTCCAAC AAAACCAGAA TTGATGAAGC TTACTCTAAC TGTGGGTCTT AGCGGTCTAA CTGTCCTAGT ACCTCTTCCG ACTGAGGTTG TTTTGGTCTT AACTACTTCG CAACCAACGT GCAACAAAGA GTTGGTTGCA CGTTGTTTCT 1701TGCTGGGAAG TGGGGAGATC TCCGCGGCCC GGGATCCACC GGCTAGCGGG AATTCCAAAT CAACTGAGTT 
CGATCCTAAC ACGACCCTTC ACCCCTCTAG AGGCGCCGGG CCCTAGGTGG CCGATCGCCC TTAAGGTTTA GTTGACTCAA GCTAGGATTG ATTGACATTG TTGGTTTAGA TAACTGTAAC AACCAAATCT 1801AGGAAAATTT GGTATTACAA ACCTAGAGAC GGATTTATTC ACAATCTGGG AGACAATGGA GGTCATGATC AAAGCAGATA TCCTTTTAAA CCATAATGTT TGGATCTCTG CCTAAATAAG TGTTAGACCC TCTGTTACCT CCAGTACTAG TTTCGTCTAT TTGCAGATAC TGATAGAGCC AACGTCTATG ACTATCTCGG 1901AGCAACTTTG TTGCAACTGA AACCGATGCT AACCGCGGAA AAATGCCTGG CAAAAAACTG CCACTGGCAG TTATCATGGA TCGTTGAAAC AACGTTGACT TTGGCTACGA TTGGCGCCTT TTTACGGACC GTTTTTTGAC GGTGACCGTC AATAGTACCT AATGGAAGCC AATGCTTTCA TTACCTTCGG TTACGAAAGT 2001AAGCTGGCTG CACCAGGGGA TGCCTTATCT GTCTTTCAAA AATTAAGTGT ACAGCCAAAA TGAAGGTATA CATTCCAGGA TTCGACCGAC GTGGTCCCCT ACGGAATAGA CAGAAAGTTT TTAATTCACA TGTCGGTTTT ACTTCCATAT GTAAGGTCCT AGGTGTCACG ATTATGGTGG TCCACAGTGC TAATACCACC 2101TGACAAGAAA ACTGGACAGG CAGGAATTGT TGGTGCAATT GTTGACATTC CCGAAATCTC TGGATTTAAG GAGATGGCAC ACTGTTCTTT TGACCTGTCC GTCCTTAACA ACCACGTTAA CAACTGTAAG GGCTTTAGAG ACCTAAATTC CTCTACCGTG CCATGGAACA GTTCATTGCT GGTACCTTGT CAAGTAACGA 2201CAAGTTGATC GCTGCGCTTC CTGCACTACT GGATGTCTCA AAGGTCTTGC CAATGTTAAG TGCTCTGAAC TCCTGAAGAA GTTCAACTAG CGACGCGAAG GACGTGATGA CCTACAGAGT TTCCAGAACG GTTACAATTC ACGAGACTTG AGGACTTCTT ATGGCTGCCT GACAGGTGTG TACCGACGGA CTGTCCACAC 2301CAAGTTTTGC TGACAAGATT CAAAAAGAAG TTCACAATAT CAAAGGCATG GCCGGCGATC GATGAGCGGC CGCAATTTAA GTTCAAAACG ACTGTTCTAA GTTTTTCTTC AAGTGTTATA GTTTCCGTAC CGGCCGCTAG CTACTCGCCG GCGTTAAATT TTCCGGTTAT TTTCCACCAT AAGGCCAATA AAAGGTGGTA 2401ATTGCCGTCT TTTGGCAATG TGAGGGCCCG GAAACCTGGC CCTGTCTTCT TGACGAGCAT TCCTAGGGGT CTTTCCCCTC TAACGGCAGA AAACCGTTAC ACTCCCGGGC CTTTGGACCG GGACAGAAGA ACTGCTCGTA AGGATCCCCA GAAAGGGGAG TCGCCAAAGG AATGCAAGGT AGCGGTTTCC TTACGTTCCA 2501CTGTTGAATG TCGTGAAGGA AGCAGTTCCT CTGGAAGCTT CTTGAAGACA AACAACGTCT GTAGCGACCC TTTGCAGGCA GACAACTTAC AGCACTTCCT TCGTCAAGGA GACCTTCGAA GAACTTCTGT TTGTTGCAGA CATCGCTGGG AAACGTCCGT GCGGAACCCC CCACCTGGCG CGCCTTGGGG GGTGGACCGC 2601ACAGGTGCCT CTGCGGCCAA AAGCCACGTG TATAAGATAC ACCTGCAAAG 
GCGGCACAAC CCCAGTGCCA CGTTGTGAGT TGTCCACGGA GACGCCGGTT TTCGGTGCAC ATATTCTATG TGGACGTTTC CGCCGTGTTG GGGTCACGGT GCAACACTCA TGGATAGTTG TGGAAAGAGT ACCTATCAAC ACCTTTCTCA 2701CAAATGGCTC ACCTCAAGCG TATTCAACAA GGGGCTGAAG GATGCCCAGA AGGTACCCCA TTGTATGGGA TCTGATCTGG GTTTACCGAG TGGAGTTCGC ATAAGTTGTT CCCCGACTTC CTACGGGTCT TCCATGGGGT AACATACCCT AGACTAGACC GGCCTCGGTG CACATGCTTT CCGGAGCCAC GTGTACGAAA 2801ACATGTGTTT AGTCGAGGTT AAAAAACGTC TAGGCCCCCC GAACCACGGG GACGTGGTTT TCCTTTGAAA AACACGATGA TGTACACAAA TCAGCTCCAA TTTTTTGCAG ATCCGGGGGG CTTGGTGCCC CTGCACCAAA AGGAAACTTT TTGTGCTACT TAATATGGCC ACCACCCATA ATTATACCGG TGGTGGGTAT 2901CCTAGGCTTT TGCAAAGATC GATCAGATCC CGGGGGGCAA TGAGATATGA AAAAGCCTGA ACTCACCGCG ACGTCTGTCG GGATCCGAAA ACGTTTCTAG CTAGTCTAGG GCCCCCCGTT ACTCTATACT TTTTCGGACT TGAGTGGCGC TGCAGACAGC AGAAGTTTCT GATCGAAAAG TCTTCAAAGA CTAGCTTTTC 3001TTCGACAGCG TCTCCGACCT GATGCAGCTC TCGGAGGGCG AAGAATCTCG TGCTTTCAGC TTCGATGTAG GAGGGCGTGG AAGCTGTCGC AGAGGCTGGA CTACGTCGAG AGCCTCCCGC TTCTTAGAGC ACGAAAGTCG AAGCTACATC CTCCCGCACC ATATGTCCTG CGGGTAAATA TATACAGGAC GCCCATTTAT 3101GCTGCGCCGA TGGTTTCTAC AAAGATCGTT ATGTTTATCG GCACTTTGCA TCGGCCGCGC TCCCGATTCC GGAAGTGCTT CGACGCGGCT ACCAAAGATG TTTCTAGCAA TACAAATAGC CGTGAAACGT AGCCGGCGCG AGGGCTAAGG CCTTCACGAA GACATTGGGG AATTCAGCGA CTGTAACCCC TTAAGTCGCT 3201GAGCCTGACC TATTGCATCT CCCGCCGTGC ACAGGGTGTC ACGTTGCAAG ACCTGCCTGA AACCGAACTG CCCGCTGTTC CTCGGACTGG ATAACGTAGA GGGCGGCACG TGTCCCACAG TGCAACGTTC TGGACGGACT TTGGCTTGAC GGGCGACAAG TGCAGCCGGT CGCGGAGGCC ACGTCGGCCA GCGCCTCCGG 3301ATGGATGCGA TCGCTGCGGC CGATCTTAGC CAGACGAGCG GGTTCGGCCC ATTCGGACCG CAAGGAATCG GTCAATACAC TACCTACGCT AGCGACGCCG GCTAGAATCG GTCTGCTCGC CCAAGCCGGG TAAGCCTGGC GTTCCTTAGC CAGTTATGTG TACATGGCGT GATTTCATAT ATGTACCGCA CTAAAGTATA 3401GCGCGATTGC TGATCCCCAT GTGTATCACT GGCAAACTGT GATGGACGAC ACCGTCAGTG CGTCCGTCGC GCAGGCTCTC CGCGCTAACG ACTAGGGGTA CACATAGTGA CCGTTTGACA CTACCTGCTG TGGCAGTCAC GCAGGCAGCG CGTCCGAGAG GATGAGCTGA TGCTTTGGGC CTACTCGACT ACGAAACCCG 3501CGAGGACTGC CCCGAAGTCC GGCACCTCGT 
GCACGCGGAT TTCGGCTCCA ACAATGTCCT GACGGACAAT GGCCGCATAA GCTCCTGACG GGGCTTCAGG CCGTGGAGCA CGTGCGCCTA AAGCCGAGGT TGTTACAGGA CTGCCTGTTA CCGGCGTATT CAGCGGTCAT TGACTGGAGC GTCGCCAGTA ACTGACCTCG 3601GAGGCGATGT TCGGGGATTC CCAATACGAG GTCGCCAACA TCTTCTTCTG GAGGCCGTGG TTGGCTTGTA TGGAGCAGCA CTCCGCTACA AGCCCCTAAG GGTTATGCTC CAGCGGTTGT AGAAGAAGAC CTCCGGCACC AACCGAACAT ACCTCGTCGT GACGCGCTAC TTCGAGCGGA CTGCGCGATG AAGCTCGCCT 3701GGCATCCGGA GCTTGCAGGA TCGCCGCGGC TCCGGGCGTA TATGCTCCGC ATTGGTCTTG ACCAACTCTA TCAGAGCTTG CCGTAGGCCT CGAACGTCCT AGCGGCGCCG AGGCCCGCAT ATACGAGGCG TAACCAGAAC TGGTTGAGAT AGTCTCGAAC GTTGACGGCA ATTTCGATGA CAACTGCCGT TAAAGCTACT 3801TGCAGCTTGG GCGCAGGGTC GATGCGACGC AATCGTCCGA TCCGGAGCCG GGACTGTCGG GCGTACACAA ATCGCCCGCA ACGTCGAACC CGCGTCCCAG CTACGCTGCG TTAGCAGGCT AGGCCTCGGC CCTGACAGCC CGCATGTGTT TAGCGGGCGT GAAGCGCGGC CGTCTGGACC CTTCGCGCCG GCAGACCTGG 3901GATGGCTGTG TAGAAGTACT CGCCGATAGT GGAAACCGAC GCCCCAGCAC TCGTCCGGAT CGGGAGATGG GGGAGGCTAA CTACCGACAC ATCTTCATGA GCGGCTATCA CCTTTGGCTG CGGGGTCGTG AGCAGGCCTA GCCCTCTACC CCCTCCGATT CTGAAACACG GAAGGAGACA GACTTTGTGC CTTCCTCTGT 4001ATACCGGAAG GAACCTCGAC GTTAACTTGT TTATTGCAGC TTATAATGGT TACAAATAAA GCAATAGCAT CACAAATTTC TATGGCCTTC CTTGGAGCTG CAATTGAACA AATAACGTCG AATATTACCA ATGTTTATTT CGTTATCGTA GTGTTTAAAG ACAAATAAAG CATTTATTAC TGTTTATTTC GTAAATAATG 4101CCTGTTATCC CTAGAATTCA CTGGCCGTCG TTTTACAACG TCGTGACTGG GAAAACCCTG GCGTTACCCA ACTTAATCGC GGACAATAGG GATCTTAAGT GACCGGCAGC AAAATGTTGC AGCACTGACC CTTTTGGGAC CGCAATGGGT TGAATTAGCG CTTGCAGCAC ATCCCCCTTT GAACGTCGTG TAGGGGGAAA 4201CGCCAGCTGG CGTAATAGCG AAGAGGCCCG CACCGATCGC CCTTCCCAAC AGTTGCGCAG CCTGAATGGC GAATGGCGCC GCGGTCGACC GCATTATCGC TTCTCCGGGC GTGGCTAGCG GGAAGGGTTG TCAACGCGTC GGACTTACCG CTTACCGCGG TGATGCGGTA TTTTCTCCTT ACTACGCCAT AAAAGAGGAA 4301ACGCATCTGT GCGGTATTTC ACACCGCATA CGTCAAAGCA ACCATAGTAC GCGCCCTGTA GCGGCGCATT AAGCGCGGCG TGCGTAGACA CGCCATAAAG TGTGGCGTAT GCAGTTTCGT TGGTATCATG CGCGGGACAT CGCCGCGTAA TTCGCGCCGC GGTGTGGTGG TTACGCGCAG CCACACCACC AATGCGCGTC 4401CGTGACCGCT 
ACACTTGCCA GCGCCCTAGC GCCCGCTCCT TTCGCTTTCT TCCCTTCCTT TCTCGCCACG TTCGCCGGCT GCACTGGCGA TGTGAACGGT CGCGGGATCG CGGGCGAGGA AAGCGAAAGA AGGGAAGGAA AGAGCGGTGC AAGCGGCCGA TTCCCCGTCA AGCTCTAAAT AAGGGGCAGT TCGAGATTTA 4501CGGGGGCTCC CTTTAGGGTT CCGATTTAGT GCTTTACGGC ACCTCGACCC CAAAAAACTT GATTTGGGTG ATGGTTCACG GCCCCCGAGG GAAATCCCAA GGCTAAATCA CGAAATGCCG TGGAGCTGGG GTTTTTTGAA CTAAACCCAC TACCAAGTGC TAGTGGGCCA TCGCCCTGAT ATCACCCGGT AGCGGGACTA 4601AGACGGTTTT TCGCCCTTTG ACGTTGGAGT CCACGTTCTT TAATAGTGGA CTCTTGTTCC AAACTGGAAC AACACTCAAC TCTGCCAAAA AGCGGGAAAC TGCAACCTCA GGTGCAAGAA ATTATCACCT GAGAACAAGG TTTGACCTTG TTGTGAGTTG CCTATCTCGG GCTATTCTTT GGATAGAGCC CGATAAGAAA 4701TGATTTATAA GGGATTTTGC CGATTTCGGC CTATTGGTTA AAAAATGAGC TGATTTAACA AAAATTTAAC GCGAATTTTA ACTAAATATT CCCTAAAACG GCTAAAGCCG GATAACCAAT TTTTTACTCG ACTAAATTGT TTTTAAATTG CGCTTAAAAT ACAAAATATT AACGTTTACA TGTTTTATAA TTGCAAATGT 4801ATTTTATGGT GCACTCTCAG TACAATCTGC TCTGATGCCG CATAGTTAAG CCAGCCCCGA CACCCGCCAA CACCCGCTGA TAAAATACCA CGTGAGAGTC ATGTTAGACG AGACTACGGC GTATCAATTC GGTCGGGGCT GTGGGCGGTT GTGGGCGACT CGCGCCCTGA CGGGCTTGTC GCGCGGGACT GCCCGAACAG 4901TGCTCCCGGC ATCCGCTTAC AGACAAGCTG TGACCGTCTA GACGAAAGGG CCTCGTGATA CGCCTATTTT TATAGGTTAA ACGAGGGCCG TAGGCGAATG TCTGTTCGAC ACTGGCAGAT CTGCTTTCCC GGAGCACTAT GCGGATAAAA ATATCCAATT TGTCATGATA ATAATGGTTT ACAGTACTAT TATTACCAAA 5001CTTAGACGTC AGGTGGCACT TTTCGGGGAA ATGTGCGCGG AACCCCTATT TGTTTATTTT TCTAAATACA TTCAAATATG GAATCTGCAG TCCACCGTGA AAAGCCCCTT TACACGCGCC TTGGGGATAA ACAAATAAAA AGATTTATGT AAGTTTATAC TATCCGCTCA TGAGACAATA ATAGGCGAGT ACTCTGTTAT 5101ACCCTGATAA ATGCTTCAAT AATATTGAAA AAGGAAGAGT ATGAGTATTC AACATTTCCG TGTCGCCCTT ATTCCCTTTT TGGGACTATT TACGAAGTTA TTATAACTTT TTCCTTCTCA TACTCATAAG TTGTAAAGGC ACAGCGGGAA TAAGGGAAAA TTGCGGCATT TTGCCTTCCT AACGCCGTAA AACGGAAGGA 5201GTTTTTGCTC ACCCAGAAAC GCTGGTGAAA GTAAAAGATG CTGAAGATCA GTTGGGTGCA CGAGTGGGTT ACATCGAACT CAAAAACGAG TGGGTCTTTG CGACCACTTT CATTTTCTAC GACTTCTAGT CAACCCACGT GCTCACCCAA TGTAGCTTGA GGATCTCAAC AGCGGTAAGA CCTAGAGTTG 
TCGCCATTCT 5301TCCTTGAGAG TTTTCGCCCC GAAGAACGTT TTCCAATGAT GAGCACTTTT AAAGTTCTGC TATGTGGCGC GGTATTATCC AGGAACTCTC AAAAGCGGGG CTTCTTGCAA AAGGTTACTA CTCGTGAAAA TTTCAAGACG ATACACCGCG CCATAATAGG CGTATTGACG CCGGGCAAGA GCATAACTGC GGCCCGTTCT 5401GCAACTCGGT CGCCGCATAC ACTATTCTCA GAATGACTTG GTTGAGTACT CACCAGTCAC AGAAAAGCAT CTTACGGATG CGTTGAGCCA GCGGCGTATG TGATAAGAGT CTTACTGAAC CAACTCATGA GTGGTCAGTG TCTTTTCGTA GAATGCCTAC GCATGACAGT AAGAGAATTA CGTACTGTCA TTCTCTTAAT 5501TGCAGTGCTG CCATAACCAT GAGTGATAAC ACTGCGGCCA ACTTACTTCT GACAACGATC GGAGGACCGA AGGAGCTAAC ACGTCACGAC GGTATTGGTA CTCACTATTG TGACGCCGGT TGAATGAAGA CTGTTGCTAG CCTCCTGGCT TCCTCGATTG CGCTTTTTTG CACAACATGG GCGAAAAAAC GTGTTGTACC 5601GGGATCATGT AACTCGCCTT GATCGTTGGG AACCGGAGCT GAATGAAGCC ATACCAAACG ACGAGCGTGA CACCACGATG CCCTAGTACA TTGAGCGGAA CTAGCAACCC TTGGCCTCGA CTTACTTCGG TATGGTTTGC TGCTCGCACT GTGGTGCTAC CCTGTAGCAA TGGCAACAAC GGACATCGTT ACCGTTGTTG 5701GTTGCGCAAA CTATTAACTG GCGAACTACT TACTCTAGCT TCCCGGCAAC AATTAATAGA CTGGATGGAG GCGGATAAAG CAACGCGTTT GATAATTGAC CGCTTGATGA ATGAGATCGA AGGGCCGTTG TTAATTATCT GACCTACCTC CGCCTATTTC TTGCAGGACC ACTTCTGCGC AACGTCCTGG TGAAGACGCG 5801TCGGCCCTTC CGGCTGGCTG GTTTATTGCT GATAAATCTG GAGCCGGTGA GCGTGGGTCT CGCGGTATCA TTGCAGCACT AGCCGGGAAG GCCGACCGAC CAAATAACGA CTATTTAGAC CTCGGCCACT CGCACCCAGA GCGCCATAGT AACGTCGTGA GGGGCCAGAT GGTAAGCCCT CCCCGGTCTA CCATTCGGGA 5901CCCGTATCGT AGTTATCTAC ACGACGGGGA GTCAGGCAAC TATGGATGAA CGAAATAGAC AGATCGCTGA GATAGGTGCC GGGCATAGCA TCAATAGATG TGCTGCCCCT CAGTCCGTTG ATACCTACTT GCTTTATCTG TCTAGCGACT CTATCCACGG TCACTGATTA AGCATTGGTA AGTGACTAAT TCGTAACCAT 6001ACTGTCAGAC CAAGTTTACT CATATATACT TTAGATTGAT TTAAAACTTC ATTTTTAATT TAAAAGGATC TAGGTGAAGA TGACAGTCTG GTTCAAATGA GTATATATGA AATCTAACTA AATTTTGAAG TAAAAATTAA ATTTTCCTAG ATCCACTTCT TCCTTTTTGA TAATCTCATG AGGAAAAACT ATTAGAGTAC 6101ACCAAAATCC CTTAACGTGA GTTTTCGTTC CACTGAGCGT CAGACCCCGT AGAAAAGATC AAAGGATCTT CTTGAGATCC TGGTTTTAGG GAATTGCACT CAAAAGCAAG GTGACTCGCA GTCTGGGGCA TCTTTTCTAG TTTCCTAGAA GAACTCTAGG TTTTTTTCTG 
CGCGTAATCT AAAAAAAGAC GCGCATTAGA 6201GCTGCTTGCA AACAAAAAAA CCACCGCTAC CAGCGGTGGT TTGTTTGCCG GATCAAGAGC TACCAACTCT TTTTCCGAAG CGACGAACGT TTGTTTTTTT GGTGGCGATG GTCGCCACCA AACAAACGGC CTAGTTCTCG ATGGTTGAGA AAAAGGCTTC GTAACTGGCT TCAGCAGAGC CATTGACCGA AGTCGTCTCG 6301GCAGATACCA AATACTGTCC TTCTAGTGTA GCCGTAGTTA GGCCACCACT TCAAGAACTC TGTAGCACCG CCTACATACC CGTCTATGGT TTATGACAGG AAGATCACAT CGGCATCAAT CCGGTGGTGA AGTTCTTGAG ACATCGTGGC GGATGTATGG TCGCTCTGCT AATCCTGTTA AGCGAGACGA TTAGGACAAT 6401CCAGTGGCTG CTGCCAGTGG CGATAAGTCG TGTCTTACCG GGTTGGACTC AAGACGATAG TTACCGGATA AGGCGCAGCG GGTCACCGAC GACGGTCACC GCTATTCAGC ACAGAATGGC CCAACCTGAG TTCTGCTATC AATGGCCTAT TCCGCGTCGC GTCGGGCTGA ACGGGGGGTT CAGCCCGACT TGCCCCCCAA 6501CGTGCACACA GCCCAGCTTG GAGCGAACGA CCTACACCGA ACTGAGATAC CTACAGCGTG AGCTATGAGA AAGCGCCACG GCACGTGTGT CGGGTCGAAC CTCGCTTGCT GGATGTGGCT TGACTCTATG GATGTCGCAC TCGATACTCT TTCGCGGTGC CTTCCCGAAG GGAGAAAGGC GAAGGGCTTC CCTCTTTCCG 6601GGACAGGTAT CCGGTAAGCG GCAGGGTCGG AACAGGAGAG CGCACGAGGG AGCTTCCAGG GGGAAACGCC TGGTATCTTT CCTGTCCATA GGCCATTCGC CGTCCCAGCC TTGTCCTCTC GCGTGCTCCC TCGAAGGTCC CCCTTTGCGG ACCATAGAAA ATAGTCCTGT CGGGTTTCGC TATCAGGACA GCCCAAAGCG 6701CACCTCTGAC TTGAGCGTCG ATTTTTGTGA TGCTCGTCAG GGGGGCGGAG CCTATGGAAA AACGCCAGCA ACGCGGCCTT GTGGAGACTG AACTCGCAGC TAAAAACACT ACGAGCAGTC CCCCCGCCTC GGATACCTTT TTGCGGTCGT TGCGCCGGAA TTTACGGTTC CTGGCCTTTT AAATGCCAAG GACCGGAAAA 6801GCTGGCCTTT TGCTCACATG TTCTTTCCTG CGTTATCCCC TGATTCTGTG GATAACCGTA TTACCGCCTT TGAGTGAGCT CGACCGGAAA ACGAGTGTAC AAGAAAGGAC GCAATAGGGG ACTAAGACAC CTATTGGCAT AATGGCGGAA ACTCACTCGA GATACCGCTC GCCGCAGCCG CTATGGCGAG CGGCGTCGGC 6901AACGACCGAG CGCAGCGAGT CAGTGAGCGA GGAAGCGGAA GTTGCTGGCTC GCGTCGCTCA GTCACTCGCT CCTTCGCCTT C

1.-15. (canceled)
16. A method for detecting an antigen of interest in a sample, comprising the steps of (a) contacting the sample with an antibody that specifically binds the antigen under conditions that promote the formation of an antibody-antigen complex, (b) contacting the antibody-antigen complex with a fusion protein comprising (i) the immunoglobulin-binding domains of staphylococcal protein A and streptococcal protein G, and (ii) Metridia longa luciferase or a derivative lacking the N-terminal region, under conditions that promote binding of the fusion protein to the antibody-antigen complex, and (c) detecting the Metridia longa luciferase.
17. The method of claim 16, wherein the fusion protein is encoded by a vector selected from the group consisting of pS14L-spAG-MLuc16, pETspAG-ΔN-MLuc1, and pS14L-spAG-ΔN-MLuc15.
18. The method of claim 17, wherein the fusion protein is encoded by pS14L-spAG-MLuc16 or pETspAG-ΔN-MLuc1.
19. The method of claim 17, wherein the fusion protein is encoded by pS14L-spAG-ΔN-MLuc15.
20. An IgG fusion protein comprising IgG heavy chains fused with a peptide or polypeptide selected from the group consisting of green fluorescent protein (GFP), Metridia longa luciferase, cellulose binding domain, 6× histidine, or a biotinylatable peptide.